GetFormattedText
GetFormattedText(FormattedTextOptions)
Extracts formatted text from the document.
public TextReader GetFormattedText(FormattedTextOptions options)
Parameter | Type | Description |
---|---|---|
options | FormattedTextOptions | The formatted text extraction options. |
Return Value
An instance of the TextReader class with the extracted text; null if formatted text extraction isn’t supported.
Remarks
Learn more:
- Extract formatted text from document
- Extract a document text as HTML
- Extract a document text as Markdown
- Extract a document text as Plain text
Examples
The following example shows how to extract document text as HTML:
// Create an instance of the Parser class
using (Parser parser = new Parser(filePath))
{
    // Extract the formatted text into a reader
    using (TextReader reader = parser.GetFormattedText(new FormattedTextOptions(FormattedTextMode.Html)))
    {
        // Print the formatted text from the document
        // If formatted text extraction isn't supported, the reader is null
        Console.WriteLine(reader == null ? "Formatted text extraction isn't supported" : reader.ReadToEnd());
    }
}
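Because the method returns null when formatted text extraction isn't supported, the null check can be kept in one place. The following is a minimal sketch of that idea; `FormattedTextHelper` and `ReadOrFallback` are hypothetical names introduced for illustration and are not part of GroupDocs.Parser. The helper relies only on `System.IO.TextReader`, so it can be tried with any reader.

```csharp
using System;
using System.IO;

public static class FormattedTextHelper
{
    // Hypothetical helper, not part of GroupDocs.Parser.
    // Returns the full content of the reader, or the fallback text
    // when the reader is null (extraction not supported).
    public static string ReadOrFallback(TextReader reader, string fallback)
    {
        if (reader == null)
        {
            return fallback;
        }

        // Dispose the reader once its content has been read
        using (reader)
        {
            return reader.ReadToEnd();
        }
    }
}
```

Assuming a `parser` created as in the example above, a call might look like `FormattedTextHelper.ReadOrFallback(parser.GetFormattedText(new FormattedTextOptions(FormattedTextMode.Html)), "Formatted text extraction isn't supported")`. The helper disposes the reader itself, so the caller needs no separate `using` block.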
See Also
- class FormattedTextOptions
- class Parser
- namespace GroupDocs.Parser
- assembly GroupDocs.Parser
GetFormattedText(int, FormattedTextOptions)
Extracts formatted text from a document page.
public TextReader GetFormattedText(int pageIndex, FormattedTextOptions options)
Parameter | Type | Description |
---|---|---|
pageIndex | Int32 | The zero-based page index. |
options | FormattedTextOptions | The formatted text extraction options. |
Return Value
An instance of the TextReader class with the extracted text; null if formatted text extraction from a page isn’t supported.
Remarks
Learn more:
- Extract formatted text from document page
- Extract a document text as HTML
- Extract a document text as Markdown
- Extract a document text as Plain text
Examples
The following example shows how to extract the text of each document page as Markdown:
// Create an instance of the Parser class
using (Parser parser = new Parser(filePath))
{
    // Check if the document supports formatted text extraction
    if (!parser.Features.FormattedText)
    {
        Console.WriteLine("Document doesn't support formatted text extraction.");
        return;
    }

    // Get the document info
    IDocumentInfo documentInfo = parser.GetDocumentInfo();

    // Check if the document has pages
    if (documentInfo.PageCount == 0)
    {
        Console.WriteLine("Document has no pages.");
        return;
    }

    // Iterate over pages (page indexes are zero-based)
    for (int p = 0; p < documentInfo.PageCount; p++)
    {
        // Print the one-based page number
        Console.WriteLine(string.Format("Page {0}/{1}", p + 1, documentInfo.PageCount));

        // Extract the formatted text of the page into a reader
        using (TextReader reader = parser.GetFormattedText(p, new FormattedTextOptions(FormattedTextMode.Markdown)))
        {
            // Print the formatted text of the page
            // The null check is skipped because formatted text support was verified above
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
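When the page texts are needed as data rather than console output, the per-page loop can be factored into a small collector. The sketch below is one possible shape, not the library's own API: `PageTextCollector`, `ReadPages`, and the `openPage` delegate are hypothetical names. The delegate is assumed to wrap a call such as `p => parser.GetFormattedText(p, options)`, which keeps the helper independent of the parser and easy to try with in-memory readers.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PageTextCollector
{
    // Hypothetical helper, not part of GroupDocs.Parser.
    // openPage receives a zero-based page index and returns a reader
    // for that page, or null when extraction isn't supported.
    public static List<string> ReadPages(int pageCount, Func<int, TextReader> openPage)
    {
        if (openPage == null)
        {
            throw new ArgumentNullException(nameof(openPage));
        }

        var pages = new List<string>(Math.Max(pageCount, 0));

        for (int p = 0; p < pageCount; p++)
        {
            // A using statement accepts a null resource, so no extra guard is needed
            using (TextReader reader = openPage(p))
            {
                // Store an empty string for pages without a reader
                pages.Add(reader == null ? string.Empty : reader.ReadToEnd());
            }
        }

        return pages;
    }
}
```

Assuming `parser` and `documentInfo` from the example above, a call might look like `PageTextCollector.ReadPages(documentInfo.PageCount, p => parser.GetFormattedText(p, new FormattedTextOptions(FormattedTextMode.Markdown)))`. Storing an empty string for a null reader keeps list positions aligned with page indexes; a caller that needs to tell "unsupported" from "empty page" would want a different representation.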
See Also
- class FormattedTextOptions
- class Parser
- namespace GroupDocs.Parser
- assembly GroupDocs.Parser