Parser

Parser(DbConnection)

Initializes a new instance of the Parser class to extract data from a database.

public Parser(DbConnection connection)
Parameter Type Description
connection DbConnection The database connection.

Examples

The following example shows how to extract data from a SQLite database:

// Create DbConnection object
DbConnection connection = new SQLiteConnection(string.Format("Data Source={0};Version=3;", Constants.SampleDatabase));
// Create an instance of Parser class to extract tables from the database
using (Parser parser = new Parser(connection))
{
    // Check if text extraction is supported
    if (!parser.Features.Text)
    {
        Console.WriteLine("Text extraction isn't supported.");
        return;
    }
    // Check if toc extraction is supported
    if (!parser.Features.Toc)
    {
        Console.WriteLine("Toc extraction isn't supported.");
        return;
    }
    // Get a list of tables
    IEnumerable<TocItem> toc = parser.GetToc();
    // Iterate over tables
    foreach (TocItem i in toc)
    {
        // Print the table name
        Console.WriteLine(i.Text);
        // Extract the table content as text
        using (TextReader reader = parser.GetText(i.PageIndex.Value))
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}

Parser(DbConnection, ParserSettings)

Initializes a new instance of the Parser class to extract data from a database.

public Parser(DbConnection connection, ParserSettings parserSettings)
Parameter Type Description
connection DbConnection The database connection.
parserSettings ParserSettings The parser settings which are used to customize data extraction.

Examples

The following example shows how to extract data from a SQLite database:

// Create DbConnection object
DbConnection connection = new SQLiteConnection(string.Format("Data Source={0};Version=3;", Constants.SampleDatabase));
// Create an instance of Parser class to extract tables from the database
using (Parser parser = new Parser(connection))
{
    // Check if text extraction is supported
    if (!parser.Features.Text)
    {
        Console.WriteLine("Text extraction isn't supported.");
        return;
    }
    // Check if toc extraction is supported
    if (!parser.Features.Toc)
    {
        Console.WriteLine("Toc extraction isn't supported.");
        return;
    }
    // Get a list of tables
    IEnumerable<TocItem> toc = parser.GetToc();
    // Iterate over tables
    foreach (TocItem i in toc)
    {
        // Print the table name
        Console.WriteLine(i.Text);
        // Extract the table content as text
        using (TextReader reader = parser.GetText(i.PageIndex.Value))
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}

Parser(EmailConnection)

Initializes a new instance of the Parser class to extract data from a remote email server.

public Parser(EmailConnection connection)
Parameter Type Description
connection EmailConnection The email connection.

Examples

The following example shows how to extract emails from an Exchange Server:

// Create the connection object for the Exchange Web Services protocol
EmailConnection connection = new EmailEwsConnection(
    "https://outlook.office365.com/ews/exchange.asmx",
    "email@server",
    "password");
 
// Create an instance of Parser class to extract emails from the remote server
using (Parser parser = new Parser(connection))
{
    // Check if container extraction is supported
    if (!parser.Features.Container)
    {
        Console.WriteLine("Container extraction isn't supported.");
        return;
    }

    // Extract email messages from the server
    IEnumerable<ContainerItem> emails = parser.GetContainer();

    // Iterate over email messages
    foreach (ContainerItem item in emails)
    {
        // Create an instance of Parser class for email message
        using (Parser emailParser = item.OpenParser())
        {
            // Extract the email text
            using (TextReader reader = emailParser.GetText())
            {
                // Print the email text
                Console.WriteLine(reader == null ? "Text extraction isn't supported." : reader.ReadToEnd());
            }
        }
    }
}

Parser(EmailConnection, ParserSettings)

Initializes a new instance of the Parser class to extract data from a remote email server.

public Parser(EmailConnection connection, ParserSettings parserSettings)
Parameter Type Description
connection EmailConnection The email connection.
parserSettings ParserSettings The parser settings which are used to customize data extraction.

Examples

The following example shows how to extract emails from an Exchange Server:

// Create the connection object for the Exchange Web Services protocol
EmailConnection connection = new EmailEwsConnection(
    "https://outlook.office365.com/ews/exchange.asmx",
    "email@server",
    "password");
 
// Create an instance of Parser class to extract emails from the remote server
using (Parser parser = new Parser(connection))
{
    // Check if container extraction is supported
    if (!parser.Features.Container)
    {
        Console.WriteLine("Container extraction isn't supported.");
        return;
    }

    // Extract email messages from the server
    IEnumerable<ContainerItem> emails = parser.GetContainer();

    // Iterate over email messages
    foreach (ContainerItem item in emails)
    {
        // Create an instance of Parser class for email message
        using (Parser emailParser = item.OpenParser())
        {
            // Extract the email text
            using (TextReader reader = emailParser.GetText())
            {
                // Print the email text
                Console.WriteLine(reader == null ? "Text extraction isn't supported." : reader.ReadToEnd());
            }
        }
    }
}

Parser(string)

Initializes a new instance of the Parser class.

public Parser(string filePath)
Parameter Type Description
filePath String The path to the file.

Examples

The following example shows how to load the document from the local disk:

// Create an instance of Parser class with the filePath
using (Parser parser = new Parser(filePath))
{
    // Extract the text into a reader
    using (TextReader reader = parser.GetText())
    {
        // Print the text from the document
        // If text extraction isn't supported, the reader is null
        Console.WriteLine(reader == null ? "Text extraction isn't supported" : reader.ReadToEnd());
    }
}

Parser(string, LoadOptions)

Initializes a new instance of the Parser class with LoadOptions.

public Parser(string filePath, LoadOptions loadOptions)
Parameter Type Description
filePath String The path to the file.
loadOptions LoadOptions The options to open the file.

Examples

The document password is passed via the LoadOptions class:

try
{
    // Create an instance of Parser class with the password:
    using (Parser parser = new Parser(filePath, new LoadOptions(password)))
    {
        // Check if text extraction is supported
        if (!parser.Features.Text)
        {
            Console.WriteLine("Text extraction isn't supported.");
            return;
        }
        // Print the document text
        using (TextReader reader = parser.GetText())
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
catch (InvalidPasswordException)
{
    // Print the message if the password is incorrect or empty
    Console.WriteLine("Invalid password");
}

Parser(string, ParserSettings)

Initializes a new instance of the Parser class with ParserSettings.

public Parser(string filePath, ParserSettings parserSettings)
Parameter Type Description
filePath String The path to the file.
parserSettings ParserSettings The parser settings which are used to customize data extraction.
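
Examples

A minimal sketch of this overload, reusing only the calls shown for the other overloads on this page. It assumes that ParserSettings accepts an ILogger in its constructor, that ILogger and ParserSettings live in the GroupDocs.Parser.Options namespace, and that "sample.pdf" is a hypothetical file path; the ConsoleLogger class is an illustrative helper, not part of the library:

```csharp
using System;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Options; // Assumed namespace for ParserSettings and ILogger

internal static class ParserSettingsFromPathExample
{
    // Hypothetical helper: forwards every parser message to the console
    private sealed class ConsoleLogger : ILogger
    {
        public void Error(string message, Exception exception)
        {
            Console.WriteLine("Error: " + message);
        }

        public void Trace(string message)
        {
            Console.WriteLine("Event: " + message);
        }

        public void Warning(string message)
        {
            Console.WriteLine("Warning: " + message);
        }
    }

    private static void Main()
    {
        // Hypothetical path; replace it with a real document
        string filePath = "sample.pdf";

        // Pass the settings directly, without LoadOptions
        using (Parser parser = new Parser(filePath, new ParserSettings(new ConsoleLogger())))
        {
            using (TextReader reader = parser.GetText())
            {
                // If text extraction isn't supported, the reader is null
                Console.WriteLine(reader == null ? "Text extraction isn't supported." : reader.ReadToEnd());
            }
        }
    }
}
```

Compared with the three-argument overload, this form avoids passing null for LoadOptions when only logging needs to be customized.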

Parser(string, LoadOptions, ParserSettings)

Initializes a new instance of the Parser class with LoadOptions and ParserSettings.

public Parser(string filePath, LoadOptions loadOptions, ParserSettings parserSettings)
Parameter Type Description
filePath String The path to the file.
loadOptions LoadOptions The options to open the file.
parserSettings ParserSettings The parser settings which are used to customize data extraction.

Examples

The following example shows how to receive log information via the ILogger interface:

try
{
    // Create an instance of Logger class
    Logger logger = new Logger();
    // Create an instance of Parser class with the parser settings
    using (Parser parser = new Parser(filePath, null, new ParserSettings(logger)))
    {
        // Check if text extraction is supported
        if (!parser.Features.Text)
        {
            Console.WriteLine("Text extraction isn't supported.");
            return;
        }
        // Print the document text
        using (TextReader reader = parser.GetText())
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
catch (InvalidPasswordException)
{
    // Ignore the exception
}
 
private class Logger : ILogger
{
    public void Error(string message, Exception exception)
    {
        // Print error message
        Console.WriteLine("Error: " + message);
    }
    public void Trace(string message)
    {
        // Print event message
        Console.WriteLine("Event: " + message);
    }
    public void Warning(string message)
    {
        // Print warning message
        Console.WriteLine("Warning: " + message);
    }
}

Parser(Stream)

Initializes a new instance of the Parser class.

public Parser(Stream document)
Parameter Type Description
document Stream The source input stream.

Examples

The following example shows how to load the document from the stream:

// Create an instance of Parser class with the stream
using (Parser parser = new Parser(stream))
{
    // Extract the text into a reader
    using (TextReader reader = parser.GetText())
    {
        // Print the text from the document
        // If text extraction isn't supported, the reader is null
        Console.WriteLine(reader == null ? "Text extraction isn't supported" : reader.ReadToEnd());
    }
}

Parser(Stream, LoadOptions)

Initializes a new instance of the Parser class with LoadOptions.

public Parser(Stream document, LoadOptions loadOptions)
Parameter Type Description
document Stream The source input stream.
loadOptions LoadOptions The options to open the file.

Examples

In some cases it's necessary to specify the FileFormat explicitly, both for special cases (databases, email servers) and for detecting file types by their content:

// Create an instance of Parser class for markdown document
using (Parser parser = new Parser(stream, new LoadOptions(FileFormat.Markup)))
{
    // Check if text extraction is supported
    if (!parser.Features.Text)
    {
        Console.WriteLine("Text extraction isn't supported.");
        return;
    }
    using (TextReader reader = parser.GetText())
    {
        // Print the document text
        // Markdown is detected; text without special symbols is printed
        Console.WriteLine(reader.ReadToEnd());
    }
}

The document password is passed via the LoadOptions class:

try
{
    // Create an instance of Parser class with the password:
    using (Parser parser = new Parser(stream, new LoadOptions(password)))
    {
        // Check if text extraction is supported
        if (!parser.Features.Text)
        {
            Console.WriteLine("Text extraction isn't supported.");
            return;
        }
        // Print the document text
        using (TextReader reader = parser.GetText())
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
catch (InvalidPasswordException)
{
    // Print the message if the password is incorrect or empty
    Console.WriteLine("Invalid password");
}

Parser(Stream, ParserSettings)

Initializes a new instance of the Parser class with ParserSettings.

public Parser(Stream document, ParserSettings parserSettings)
Parameter Type Description
document Stream The source input stream.
parserSettings ParserSettings The parser settings which are used to customize data extraction.
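
Examples

A minimal sketch of the stream-based overload, built only from calls shown elsewhere on this page. It assumes that ParserSettings accepts an ILogger in its constructor, that ILogger and ParserSettings live in the GroupDocs.Parser.Options namespace, and that "sample.pdf" is a hypothetical file used to obtain the stream; the ConsoleLogger class is an illustrative helper, not part of the library:

```csharp
using System;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Options; // Assumed namespace for ParserSettings and ILogger

internal static class ParserSettingsFromStreamExample
{
    // Hypothetical helper: forwards every parser message to the console
    private sealed class ConsoleLogger : ILogger
    {
        public void Error(string message, Exception exception)
        {
            Console.WriteLine("Error: " + message);
        }

        public void Trace(string message)
        {
            Console.WriteLine("Event: " + message);
        }

        public void Warning(string message)
        {
            Console.WriteLine("Warning: " + message);
        }
    }

    private static void Main()
    {
        // Hypothetical source; any readable Stream could be used instead
        using (Stream stream = File.OpenRead("sample.pdf"))
        // Pass the stream and the settings, without LoadOptions
        using (Parser parser = new Parser(stream, new ParserSettings(new ConsoleLogger())))
        using (TextReader reader = parser.GetText())
        {
            // If text extraction isn't supported, the reader is null
            Console.WriteLine(reader == null ? "Text extraction isn't supported." : reader.ReadToEnd());
        }
    }
}
```

The stream is declared first so that it is disposed last, after the parser and the reader have finished with it.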

Parser(Stream, LoadOptions, ParserSettings)

Initializes a new instance of the Parser class with LoadOptions and ParserSettings.

public Parser(Stream document, LoadOptions loadOptions, ParserSettings parserSettings)
Parameter Type Description
document Stream The source input stream.
loadOptions LoadOptions The options to open the file.
parserSettings ParserSettings The parser settings which are used to customize data extraction.

Examples

The following example shows how to receive log information via the ILogger interface:

try
{
    // Create an instance of Logger class
    Logger logger = new Logger();
    // Create an instance of Parser class with the parser settings
    using (Parser parser = new Parser(stream, null, new ParserSettings(logger)))
    {
        // Check if text extraction is supported
        if (!parser.Features.Text)
        {
            Console.WriteLine("Text extraction isn't supported.");
            return;
        }
        // Print the document text
        using (TextReader reader = parser.GetText())
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
catch (InvalidPasswordException)
{
    // Ignore the exception
}
 
private class Logger : ILogger
{
    public void Error(string message, Exception exception)
    {
        // Print error message
        Console.WriteLine("Error: " + message);
    }
    public void Trace(string message)
    {
        // Print event message
        Console.WriteLine("Event: " + message);
    }
    public void Warning(string message)
    {
        // Print warning message
        Console.WriteLine("Warning: " + message);
    }
}
