Parser class
Represents the main class that controls text, image, and container extraction, as well as parsing functionality.
The Parser type exposes the following members:

| Constructor | Description |
|---|---|
| init | Initializes a new instance of the Parser class. |
| init | Initializes a new instance of the Parser class with LoadOptions. |
| init | Initializes a new instance of the Parser class with ParserSettings. |
| init | Initializes a new instance of the Parser class with LoadOptions and ParserSettings. |
| init | Initializes a new instance of the Parser class. |
| init | Initializes a new instance of the Parser class with LoadOptions. |
| init | Initializes a new instance of the Parser class with ParserSettings. |
| init | Initializes a new instance of the Parser class with LoadOptions and ParserSettings. |
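The constructor overloads above differ only in which optional objects accompany the document source. The following is a minimal sketch of choosing an overload from whatever arguments are available. It assumes, without confirmation from this page, that the first argument is a file path or stream and that the argument order is source, LoadOptions, ParserSettings. `build_parser_args` and `open_parser` are hypothetical helpers, not part of the library.

```python
from typing import Any, Optional, Tuple


def build_parser_args(
    source: Any,
    load_options: Optional[Any] = None,
    settings: Optional[Any] = None,
) -> Tuple[Any, ...]:
    """Build positional arguments for one of the four overload shapes.

    Assumption: the order is (source, LoadOptions, ParserSettings).
    """
    args = [source]
    if load_options is not None:
        args.append(load_options)
    if settings is not None:
        args.append(settings)
    return tuple(args)


def open_parser(
    parser_cls: Any,
    source: Any,
    load_options: Optional[Any] = None,
    settings: Optional[Any] = None,
) -> Any:
    """Instantiate parser_cls, passing only the arguments that were supplied."""
    return parser_cls(*build_parser_args(source, load_options, settings))
```

With the library installed, this would presumably be called as `open_parser(Parser, "sample.pdf")` after `from groupdocs.parser import Parser`, where `sample.pdf` is a placeholder file name.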
| Property | Description |
|---|---|
| features | Gets the supported features. |

| Method | Description |
|---|---|
| get_file_info | Returns the general information about a file. |
| get_file_info | Returns the general information about a file. |
| get_file_info | Returns the general information about a file. |
| get_file_info | Returns the general information about a file. |
| get_page_preview | Generates a document page preview. |
| get_page_preview | Generates a document page preview using customization options. |
| get_text | Extracts text from the document. |
| get_text | Extracts text from the document using text options (to enable raw fast text extraction mode). |
| get_text | Extracts text from the document page. |
| get_text | Extracts text from the document page using text options (to enable raw fast text extraction mode). |
| get_formatted_text | Extracts formatted text from the document. |
| get_formatted_text | Extracts formatted text from the document page. |
| search | Searches for a keyword in the document. |
| search | Searches for a keyword in the document using search options (regular expression, match case, etc.). |
| get_text_areas | Extracts text areas from the document. |
| get_text_areas | Extracts text areas from the document using customization options (regular expression, match case, etc.). |
| get_text_areas | Extracts text areas from the document page. |
| get_text_areas | Extracts text areas from the document page using customization options (regular expression, match case, etc.). |
| get_images | Extracts images from the document. |
| get_images | Extracts images from the document using customization options (to set the rectangular area that contains images). |
| get_images | Extracts images from the document page. |
| get_images | Extracts images from the document page using customization options (to set the rectangular area that contains images). |
| get_hyperlinks | Extracts hyperlinks from the document. |
| get_hyperlinks | Extracts hyperlinks from the document page. |
| get_hyperlinks | Extracts hyperlinks from the document using customization options (to set the rectangular area that contains hyperlinks). |
| get_hyperlinks | Extracts hyperlinks from the document page using customization options (to set the rectangular area that contains hyperlinks). |
| get_barcodes | Extracts barcodes from the document. |
| get_barcodes | Extracts barcodes from the document page. |
| get_barcodes | Extracts barcodes from the document using customization options (to set the rectangular area that contains barcodes). |
| get_barcodes | Extracts barcodes from the document using customization options. |
| get_barcodes | Extracts barcodes from the document page using customization options (to set the rectangular area that contains barcodes). |
| get_barcodes | Extracts barcodes from the document page using customization options. |
| get_tables | Extracts tables from the document. |
| get_tables | Extracts tables from the document page. |
| get_tables | Extracts tables from the document. |
| get_tables | Extracts tables from the document page. |
| get_worksheet_info | Extracts information about all worksheets in the spreadsheet. |
| get_worksheet_info | Extracts information about the worksheet. |
| get_worksheet_cells | Extracts worksheet cells. |
| get_worksheet_cells | Extracts worksheet cells using customization options. |
| parse_by_template | Parses the document by the user-generated template. |
| parse_by_template | Parses the document by the user-generated template using customization options. |
| parse_by_template | Selects the most suitable template from the provided collection and then parses the document against the selected template. |
| parse_pages_by_template | Parses the document pages by the user-generated template. |
| parse_pages_by_template | Parses the document pages by the user-generated template using customization options. |
| generate_preview | Generates previews of document pages. |
| get_document_info | Returns the general information about the document. |
| get_highlight | Extracts a highlight from the document. |
| get_toc | Extracts a table of contents from the document. |
| get_container | Extracts a container object from the document to work with formats that contain attachments, ZIP archives, etc. |
| get_metadata | Extracts metadata from the document. |
| generate_adjustment_fields | Generates fields for automatic adjustment of template position and scale. |
| parse_form | Parses the document form. |
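Several of the extraction methods listed above have a form that takes no arguments. As a rough sketch, a small helper can call a few of them by name and report the outcome for a given document. Two assumptions are made here that this page does not confirm: that each named method can be called with no arguments, and that a return value of `None` signals an unsupported feature (as in the .NET edition of the library). `probe_extractors` is a hypothetical helper, not part of the library.

```python
from typing import Any, Dict, Iterable

# Method names taken from the table above. Calling them with no arguments
# is an assumption about the available overloads.
DEFAULT_EXTRACTORS = ("get_text", "get_images", "get_metadata", "get_toc")


def probe_extractors(
    parser: Any,
    names: Iterable[str] = DEFAULT_EXTRACTORS,
) -> Dict[str, str]:
    """Call each named method on parser and classify the result.

    "missing"     -> the object has no attribute with that name
    "unsupported" -> the call returned None (assumed to mean unsupported)
    "ok"          -> the call returned any other value
    """
    results: Dict[str, str] = {}
    for name in names:
        method = getattr(parser, name, None)
        if method is None:
            results[name] = "missing"
            continue
        value = method()
        results[name] = "unsupported" if value is None else "ok"
    return results
```

The helper accepts any object that exposes these method names, so it can be exercised with a stub before being pointed at a real `Parser` instance.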
- module groupdocs.parser
- class LoadOptions
- class Parser
- class ParserSettings