https://github.com/UglyToad/PdfPig
Read and extract text and other content from PDFs in C# (port of PDFBox)
Last synced: 29 days ago
- Host: GitHub
- URL: https://github.com/UglyToad/PdfPig
- Owner: UglyToad
- License: apache-2.0
- Created: 2017-11-09T19:15:21.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2025-03-09T14:05:05.000Z (about 1 month ago)
- Last Synced: 2025-03-19T23:16:06.100Z (about 1 month ago)
- Topics: alto-xml, csharp, document-analysis, hocr, layout-analysis, netstandard, page-xml, pdf, pdf-document, pdf-document-processor, pdf-extractor, pdf-files, pdf-generation, pdfbox
- Language: C#
- Homepage: https://github.com/UglyToad/PdfPig/wiki
- Size: 166 MB
- Stars: 1,927
- Watchers: 48
- Forks: 252
- Open Issues: 74
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-document-understanding - PdfPig - This project allows users to read and extract text and other content from PDF files. In addition, the library can be used to create simple PDF documents containing text and geometrical shapes. This project aims to port PDFBox to C#. (Resources)