An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with structure-extraction

A curated list of projects in awesome lists tagged with structure-extraction.

https://github.com/rushi-balapure/pdf_2_json_extractor

A high-performance Python library for extracting structured content from PDF documents using layout-aware text extraction. pdf_to_json preserves document structure, including headings (H1-H6) and body text, and outputs clean JSON.

cli-tool cpu-only cross-platform data-extraction document-parsing document-processing json layout-analysis nlp offline pdf pdf-extraction pdf-parser pdf-processing pdf-to-json python python-library structure-extraction text-extraction

Last synced: 21 Apr 2026