Extract Text from Excel Sheet in Raw Mode

Introduction

In this tutorial, we’ll explore how to extract text from Excel sheets using GroupDocs.Parser for .NET in raw mode. GroupDocs.Parser is a powerful API that allows developers to work with various document formats, including Excel files, for text extraction and analysis. We’ll go through the prerequisites, import namespaces, and break down each step to demonstrate the process of extracting text from Excel sheets.

Prerequisites

Before getting started, ensure you have the following prerequisites set up:

  • Visual Studio: Install Visual Studio IDE on your machine.
  • GroupDocs.Parser for .NET: Download and install GroupDocs.Parser from the download page.
  • Sample Excel File: Prepare a sample Excel file that you’ll use for text extraction.

Import Namespaces

Begin by importing the necessary namespaces into your C# project to access the functionalities of GroupDocs.Parser:

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

Step 1: Create an Instance of Parser Class

First, create an instance of the Parser class by providing the path to your sample Excel file:

using (Parser parser = new Parser("YourSampleFile.xlsx"))
{
    // Your code for text extraction will go here
}

Step 2: Get Document Information

Retrieve document information using the GetDocumentInfo() method:

IDocumentInfo documentInfo = parser.GetDocumentInfo();

Step 3: Iterate Over Sheets

Loop through each sheet in the Excel file:

for (int p = 0; p < documentInfo.RawPageCount; p++)
{
    Console.WriteLine(string.Format("Page {0}/{1}", p + 1, documentInfo.RawPageCount));
    
    // Your code for text extraction from each sheet will go here
}

Step 4: Extract Text from Each Sheet

Extract text from each sheet using a TextReader:

using (TextReader reader = parser.GetText(p, new TextOptions(true)))
{
    Console.WriteLine(reader.ReadToEnd());
}

Conclusion

In this tutorial, we’ve covered how to extract text from Excel sheets using GroupDocs.Parser for .NET. By following the steps outlined above, you can efficiently retrieve text data from Excel files for further processing or analysis in your .NET applications.

FAQ’s

Can GroupDocs.Parser extract text from other document formats?

Yes, GroupDocs.Parser supports a wide range of document formats including Word, PDF, PowerPoint, and more.

Is GroupDocs.Parser suitable for processing large Excel files?

Yes, GroupDocs.Parser is designed to handle large documents efficiently.

Where can I find more documentation about GroupDocs.Parser?

You can refer to the documentation for detailed information and examples.

How can I obtain a temporary license for GroupDocs.Parser?

Visit this link to request a temporary license.

Does GroupDocs.Parser offer customer support?

Yes, you can seek assistance or ask questions on the GroupDocs forum.