Load Document from Local Disk
Introduction
In this tutorial, we will explore how to use GroupDocs.Parser for .NET to extract text from documents. GroupDocs.Parser is a powerful library that allows developers to parse various document formats and extract text content programmatically. We’ll cover the necessary steps to get started with text extraction using this library.
Prerequisites
Before we begin, make sure you have the following:
- Visual Studio installed on your system.
- Basic knowledge of the C# programming language.
- GroupDocs.Parser for .NET library installed (download here).
Import Namespaces
First, you need to import the necessary namespaces to your C# project:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using GroupDocs.Parser;
Step 1: Load Document from Local Disk
Begin by loading a document from your local disk. Replace "Your Sample File" with the path to your target document.
// Set the filePath
string filePath = "Your Sample File";
// Create an instance of Parser class with the filePath
using (Parser parser = new Parser(filePath))
{
    // Extract text into the reader
    using (TextReader reader = parser.GetText())
    {
        // Print the extracted text from the document
        // If text extraction isn't supported, the reader will be null
        Console.WriteLine(reader == null ? "Text extraction isn't supported" : reader.ReadToEnd());
    }
}
Explanation of Steps
- Setting File Path: Begin by specifying the path to the document you want to extract text from (the filePath variable).
- Creating Parser Instance: Instantiate the Parser class by passing the filePath.
- Extracting Text: Use the GetText() method of the Parser instance to obtain a TextReader object containing the extracted text from the document.
- Reading Extracted Text: Use the ReadToEnd() method of the TextReader to retrieve the entire text content extracted from the document.
- Handling Unsupported Formats: If the document format does not support text extraction, the reader object will be null, and you can handle this scenario accordingly.
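The steps above can also be wrapped in a small reusable method. The following is a minimal sketch rather than part of the original sample: the class name TextExtractionSketch and the helpers ReadAllOrNull and ExtractTextOrNull are hypothetical, and it assumes the GroupDocs.Parser package is referenced in the project. It relies only on the Parser constructor and GetText() call shown in Step 1, and adds a check that the file exists before parsing.

```csharp
using System;
using System.IO;
using GroupDocs.Parser;

// Hypothetical wrapper class, introduced here for illustration only.
public static class TextExtractionSketch
{
    // Returns the reader's full contents, or null when no reader was produced
    // (GetText() yields null if text extraction isn't supported).
    public static string ReadAllOrNull(TextReader reader)
    {
        return reader == null ? null : reader.ReadToEnd();
    }

    // Returns the document text, or null when the file is missing
    // or text extraction isn't supported for its format.
    public static string ExtractTextOrNull(string filePath)
    {
        if (!File.Exists(filePath))
        {
            return null;
        }

        // Both the parser and the reader are disposed when the block exits;
        // a null reader is simply skipped by the using statement.
        using (Parser parser = new Parser(filePath))
        using (TextReader reader = parser.GetText())
        {
            return ReadAllOrNull(reader);
        }
    }

    public static void Main()
    {
        // Placeholder path, as in Step 1; replace with a real document path.
        string text = ExtractTextOrNull("Your Sample File");
        Console.WriteLine(text ?? "No text could be extracted");
    }
}
```

Returning null for both the missing-file and unsupported-format cases keeps the caller's check to a single comparison; if you need to tell the two cases apart, throwing an exception for the missing file is a reasonable alternative.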
Conclusion
In this tutorial, we’ve covered the initial steps to extract text from a document using GroupDocs.Parser for .NET. This library offers extensive features for document parsing, enabling developers to efficiently work with various file formats within their applications.
FAQs
Is GroupDocs.Parser compatible with all document formats?
Not every format, but GroupDocs.Parser supports a wide range, including PDF, Microsoft Office documents (Word, Excel, PowerPoint), and more.
Can I extract metadata along with text using GroupDocs.Parser?
Yes, GroupDocs.Parser allows extraction of both text content and metadata from supported document formats.
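As a rough illustration of metadata extraction, the sketch below lists each metadata item's name and value. It is not taken from this tutorial: it assumes the GetMetadata() method and the MetadataItem type (with Name and Value properties, in the GroupDocs.Parser.Data namespace) behave as described in the library's API reference, so check the reference for your installed version. The class name MetadataSketch is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;

// Hypothetical class name, used for illustration only.
public static class MetadataSketch
{
    public static void Main()
    {
        // Placeholder path; replace with the path to your document.
        string filePath = "Your Sample File";

        using (Parser parser = new Parser(filePath))
        {
            // Assumption: GetMetadata() returns null when metadata
            // extraction isn't supported for the document format.
            IEnumerable<MetadataItem> metadata = parser.GetMetadata();
            if (metadata == null)
            {
                Console.WriteLine("Metadata extraction isn't supported");
                return;
            }

            // Print each metadata entry as "name: value"
            foreach (MetadataItem item in metadata)
            {
                Console.WriteLine($"{item.Name}: {item.Value}");
            }
        }
    }
}
```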
Where can I find more resources and support for GroupDocs.Parser?
Visit the GroupDocs.Parser Documentation for a detailed API reference, and use the GroupDocs Forum for community support.
How can I obtain a temporary license for GroupDocs.Parser?
You can request a temporary license for evaluation and testing purposes.
Is there a free trial available for GroupDocs.Parser?
Yes, you can download a free trial version of GroupDocs.Parser.