https://github.com/aescarias/pdfnaut
A Python library for exploring PDFs with ease.
https://github.com/aescarias/pdfnaut
pdf pdf-library pdf-parser python python3
Last synced: 18 days ago
JSON representation
A Python library for exploring PDFs with ease.
- Host: GitHub
- URL: https://github.com/aescarias/pdfnaut
- Owner: aescarias
- License: apache-2.0
- Created: 2023-12-19T20:08:49.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2026-03-30T17:59:17.000Z (27 days ago)
- Last Synced: 2026-03-30T19:31:14.059Z (27 days ago)
- Topics: pdf, pdf-library, pdf-parser, python, python3
- Language: Python
- Homepage: https://pdfnaut.readthedocs.io
- Size: 1.02 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
README
# pdfnaut
[](https://pdfnaut.readthedocs.io/en/latest/?badge=latest)



> [!Warning]
> pdfnaut is currently in an early stage of development and has only been tested with a small set of compliant documents. Some non-compliant documents may work under strict=False. Expect bugs or issues.
pdfnaut aims to become a PDF processor for parsing PDF 2.0 files.
pdfnaut provides a high-level interface for reading and writing PDF documents as described in the [PDF 2.0 specification](https://developer.adobe.com/document-services/docs/assets/5b15559b96303194340b99820d3a70fa/PDF_ISO_32000-2.pdf) for actions such as reading and writing metadata, modifying and inserting pages, creating PDF objects, etc.
## Installation
pdfnaut requires at least Python 3.10 or later. To install pdfnaut via pip:
```plaintext
python -m pip install pdfnaut
```
If you plan to work with encrypted or protected PDF documents, you must install one of the supported crypt providers. See [Standard Security Handler](https://pdfnaut.readthedocs.io/en/latest/reference/standard_handler.html#standard-security-handler) in the documentation for details.
## Examples
Example 1: Accessing the content stream of a page
```python
from pdfnaut import PdfDocument
pdf = PdfDocument.from_filename("tests/docs/sample.pdf")
for operator in pdf.pages[0].content_stream:
print(operator)
```
Example 2: Reading document information
```py
from pdfnaut import PdfDocument
pdf = PdfDocument.from_filename("tests/docs/sample.pdf")
print(pdf.doc_info.title)
print(pdf.doc_info.author)
```
For more examples on what pdfnaut can do, see the [`examples` directory](https://github.com/aescarias/pdfnaut/tree/main/examples) in the repository or see the guides in the [documentation](https://pdfnaut.readthedocs.io/en/latest).
## Contributing
Contributions to pdfnaut should be done according to the [Contributing Guidelines](https://github.com/aescarias/pdfnaut/blob/main/CONTRIBUTING.md). You can contribute in many ways including adding small features, resolving issues, writing documentation, and more.
## License
pdfnaut is provided under the terms of the [Apache License 2.0](https://github.com/aescarias/pdfnaut/blob/main/LICENSE).