Get document info

GroupDocs.Watermark can report a document's basic information (file type, page count, and size in bytes) before you process it. Call get_document_info() on a Watermarker opened from either a file path or a stream.

Get document information from a file

from groupdocs.watermark import Watermarker

def get_document_info():
    # The with block closes the document when it exits.
    with Watermarker("./sample.docx") as watermarker:
        info = watermarker.get_document_info()
        print("File type:", info.file_type)
        print("Pages:", info.page_count)
        print("Size, bytes:", info.size)

if __name__ == "__main__":
    get_document_info()

sample.docx is the sample file used in this example. The script prints:

File type: Docx (.docx) - WordProcessing
Pages: 3
Size, bytes: 121298

Get document information from a stream

A document can also be opened from any readable stream:

import io
from groupdocs.watermark import Watermarker

def get_document_info_from_stream():
    # Read the file into an in-memory binary stream.
    with open("./sample.docx", "rb") as f:
        stream = io.BytesIO(f.read())

    with Watermarker(stream) as watermarker:
        info = watermarker.get_document_info()
        print("File type:", info.file_type)
        print("Pages:", info.page_count)
        print("Size, bytes:", info.size)

if __name__ == "__main__":
    get_document_info_from_stream()

For the same sample.docx, the script prints:

File type: Docx (.docx) - WordProcessing
Pages: 3
Size, bytes: 121298

Use case: Validate documents before processing, for example by rejecting unsupported file types or enforcing page limits.
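
The validation use case could be sketched roughly as follows. This is a minimal illustration, not part of the GroupDocs.Watermark API: validate_document_info, ALLOWED_EXTENSIONS, MAX_PAGES, and MAX_SIZE_BYTES are hypothetical names and limits, and the file-type check assumes that str(info.file_type) contains the extension in parentheses, as the sample output "Docx (.docx) - WordProcessing" suggests. A stand-in object with the sample values is used so the sketch runs without the library installed.

```python
from types import SimpleNamespace

# Hypothetical limits, chosen only for illustration.
ALLOWED_EXTENSIONS = (".docx", ".pdf")
MAX_PAGES = 50
MAX_SIZE_BYTES = 10 * 1024 * 1024  # 10 MiB


def validate_document_info(info):
    """Return a list of problems; an empty list means the document passes."""
    problems = []

    # Assumption: str(info.file_type) includes the extension in parentheses,
    # as in the sample output "Docx (.docx) - WordProcessing".
    file_type_text = str(info.file_type).lower()
    if not any(f"({ext})" in file_type_text for ext in ALLOWED_EXTENSIONS):
        problems.append(f"unsupported file type: {info.file_type}")

    if info.page_count > MAX_PAGES:
        problems.append(f"too many pages: {info.page_count} > {MAX_PAGES}")

    if info.size > MAX_SIZE_BYTES:
        problems.append(f"file too large: {info.size} > {MAX_SIZE_BYTES} bytes")

    return problems


# Stand-in object carrying the values from the sample output, so the sketch
# runs without GroupDocs.Watermark installed.
sample_info = SimpleNamespace(
    file_type="Docx (.docx) - WordProcessing", page_count=3, size=121298
)
problems = validate_document_info(sample_info)
print("OK" if not problems else "; ".join(problems))
```

In a real script, the object returned by watermarker.get_document_info() would be passed in place of sample_info, and watermarking would proceed only when the returned list is empty.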