Parser class

The main class for extracting text, images, and containers from documents and for parsing them.

The Parser type exposes the following members:

Constructors

Constructor Description
__init__ Initializes a new instance of the Parser class to extract data from a URI.
__init__ Initializes a new instance of the Parser class to extract data from a URI with load_options.
__init__ Initializes a new instance of the Parser class to extract data from a URI with parser_settings.
__init__ Initializes a new instance of the Parser class to extract data from a URI with load_options and parser_settings.
__init__ Initializes a new instance of the Parser class to extract data from a remote email server.
__init__ Initializes a new instance of the Parser class to extract data from a remote email server.
__init__ Initializes a new instance of the Parser class.
__init__ Initializes a new instance of the Parser class with LoadOptions.
__init__ Initializes a new instance of the Parser class with ParserSettings.
__init__ Initializes a new instance of the Parser class with LoadOptions and ParserSettings.
__init__ Initializes a new instance of the Parser class.
__init__ Initializes a new instance of the Parser class with LoadOptions.
__init__ Initializes a new instance of the Parser class with ParserSettings.
__init__ Initializes a new instance of the Parser class with LoadOptions and ParserSettings.

Properties

Property Description
features Gets the supported features.
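
Because the set of supported features depends on the document, it can be useful to check `features` before calling an extraction method. The following is a minimal sketch, not the library's documented usage: it assumes that the object returned by `features` exposes boolean attributes such as `text` or `images` (these attribute names are assumptions, they are not listed on this page). The `fake_parser` object is a hypothetical stand-in so that the snippet runs without the library installed.

```python
from types import SimpleNamespace


def supports(parser, feature_name):
    """Return True if parser.features reports the named feature as supported.

    Assumption: `features` exposes one boolean attribute per feature.
    A missing attribute is treated as "not supported".
    """
    features = getattr(parser, "features", None)
    return bool(getattr(features, feature_name, False))


# Hypothetical stand-in for a Parser instance (NOT the real class).
fake_parser = SimpleNamespace(
    features=SimpleNamespace(text=True, images=False)
)

print(supports(fake_parser, "text"))    # True
print(supports(fake_parser, "images"))  # False
print(supports(fake_parser, "tables"))  # False (attribute is absent)
```

With a real Parser instance, the same helper could guard a call such as `get_text` or `get_images`, provided the attribute names match those of the installed version.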

Methods

Method Description
get_file_info Returns general information about a file.
get_file_info Returns general information about a file.
get_file_info Returns general information about a file.
get_file_info Returns general information about a file.
get_page_preview Generates a document page preview.
get_page_preview Generates a document page preview using customization options.
get_text Extracts text from the document.
get_text Extracts text from the document using text options (to enable raw fast text extraction mode).
get_text Extracts text from the document page.
get_text Extracts text from the document page using text options (to enable raw fast text extraction mode).
get_formatted_text Extracts formatted text from the document.
get_formatted_text Extracts formatted text from the document page.
search Searches for a keyword in the document.
search Searches for a keyword in the document using search options (regular expression, match case, etc.).
get_text_areas Extracts text areas from the document.
get_text_areas Extracts text areas from the document using customization options (regular expression, match case, etc.).
get_text_areas Extracts text areas from the document page.
get_text_areas Extracts text areas from the document page using customization options (regular expression, match case, etc.).
get_images Extracts images from the document.
get_images Extracts images from the document using customization options (to set the rectangular area that contains images).
get_images Extracts images from the document page.
get_images Extracts images from the document page using customization options (to set the rectangular area that contains images).
get_hyperlinks Extracts hyperlinks from the document.
get_hyperlinks Extracts hyperlinks from the document page.
get_hyperlinks Extracts hyperlinks from the document using customization options (to set the rectangular area that contains hyperlinks).
get_hyperlinks Extracts hyperlinks from the document page using customization options (to set the rectangular area that contains hyperlinks).
get_barcodes Extracts barcodes from the document.
get_barcodes Extracts barcodes from the document page.
get_barcodes Extracts barcodes from the document using customization options (to set the rectangular area that contains barcodes).
get_barcodes Extracts barcodes from the document using customization options.
get_barcodes Extracts barcodes from the document page using customization options (to set the rectangular area that contains barcodes).
get_barcodes Extracts barcodes from the document page using customization options.
get_tables Extracts tables from the document.
get_tables Extracts tables from the document page.
get_tables Extracts tables from the document.
get_tables Extracts tables from the document page.
get_worksheet_info Extracts information about all worksheets in the spreadsheet.
get_worksheet_info Extracts information about the worksheet.
get_worksheet_cells Extracts worksheet cells.
get_worksheet_cells Extracts worksheet cells using customization options.
parse_by_template Parses the document by the user-generated template.
parse_by_template Parses the document by the user-generated template using customization options.
parse_by_template Selects the most suitable template from the provided collection and then parses the document against the selected template.
parse_pages_by_template Parses the document pages by the user-generated template.
parse_pages_by_template Parses the document pages by the user-generated template using customization options.
generate_preview Generates previews of the document pages.
get_document_info Returns general information about the document.
get_highlight Extracts a highlight from the document.
get_toc Extracts a table of contents from the document.
get_container Extracts a container object from the document to work with formats that contain attachments, ZIP archives, etc.
get_metadata Extracts metadata from the document.
generate_adjustment_fields Generates fields for automatic adjustment of template position and scale.
parse_form Parses the document form.
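
Many of the methods above come in a whole-document form and a per-page form. The sketch below illustrates one way to call either form through a single helper and to treat a missing result as an empty list. It rests on assumptions that this page does not confirm: that the per-page overloads take a page index as their first argument, and that an extraction method may return `None` when the feature is not supported for the document. `StubParser` and `extract` are hypothetical names introduced only so the example runs without the library.

```python
class StubParser:
    """Hypothetical stand-in for Parser (NOT the real class).

    It mimics only the method names listed above so the helper can be run.
    """

    def __init__(self, pages):
        # pages: one list of hyperlink strings per page
        self._pages = pages

    def get_hyperlinks(self, page_index=None):
        # Whole-document form when no page index is given (assumed shape).
        if page_index is None:
            return [url for page in self._pages for url in page]
        return list(self._pages[page_index])

    def get_barcodes(self, page_index=None):
        # Simulates a feature that is unsupported for this document.
        return None


def extract(parser, method_name, *args):
    """Call an extraction method by name and normalize None to an empty list."""
    result = getattr(parser, method_name)(*args)
    return [] if result is None else list(result)


stub = StubParser([["https://a.example"],
                   ["https://b.example", "https://c.example"]])

print(extract(stub, "get_hyperlinks"))     # all three links
print(extract(stub, "get_hyperlinks", 1))  # links on the second page only
print(extract(stub, "get_barcodes"))       # []
```

Normalizing to a list keeps calling code free of `None` checks. Whether the real methods return `None`, an empty collection, or raise an exception should be verified against the installed version.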
