https://github.com/bobld/pdfpigmlnetblockclassifier
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
- Host: GitHub
- URL: https://github.com/bobld/pdfpigmlnetblockclassifier
- Owner: BobLd
- Created: 2020-01-15T11:53:27.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-03-16T01:33:43.000Z (almost 6 years ago)
- Last Synced: 2025-04-06T07:38:52.025Z (10 months ago)
- Topics: classifier, csharp, document-layout, document-layout-analysis, layout-analysis, lightgbm, machine-learning, ml-net, pdf, pdf-document, pdf-document-processor, pdfpig, publaynet
- Language: C#
- Homepage:
- Size: 1.1 MB
- Stars: 27
- Watchers: 3
- Forks: 6
- Open Issues: 0