Projects in Awesome Lists tagged with document-layout-analysis
A curated list of projects in awesome lists tagged with document-layout-analysis.
https://github.com/layout-parser/layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
computer-vision deep-learning detectron2 document-image-processing document-layout-analysis layout-analysis layout-detection layout-parser object-detection ocr
Last synced: 13 May 2025
https://github.com/Layout-Parser/layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
computer-vision deep-learning detectron2 document-image-processing document-layout-analysis layout-analysis layout-detection layout-parser object-detection ocr
Last synced: 15 Mar 2025
https://github.com/deepdoctection/deepdoctection
A Repo For Document AI
document-ai document-image-analysis document-layout-analysis document-parser document-understanding layoutlm nlp ocr publaynet pubtabnet python pytorch table-detection table-recognition tensorflow
Last synced: 04 Jan 2026
https://github.com/BobLd/DocumentLayoutAnalysis
Document layout analysis resources and repos for development with PdfPig.
alto alto-xml csharp docstrum document-layout-analysis hocr hocr-documents layout-analysis page-segmentation page-xml pdf pdfpig recursive-xy-cut table-extraction tei xy-cut xycut
Last synced: 10 May 2025
https://github.com/bobld/documentlayoutanalysis
Document layout analysis resources and repos for development with PdfPig.
alto alto-xml csharp docstrum document-layout-analysis hocr hocr-documents layout-analysis page-segmentation page-xml pdf pdfpig recursive-xy-cut table-extraction tei xy-cut xycut
Last synced: 04 Apr 2025
https://github.com/explosion/spacy-layout
📚 Process PDFs, Word documents and more with spaCy
document-layout document-layout-analysis docx generative-ai natural-language-processing nlp pdf pdf-converter rag spacy
Last synced: 14 May 2025
https://github.com/qurator-spk/eynollah
Document Layout Analysis
binarization document-layout-analysis ocr segmentation textline-detection
Last synced: 16 Jan 2026
https://github.com/lquirosd/P2PaLA
Page to PAGE Layout Analysis Tool
computer-vision deep-neural-networks document-layout-analysis gan generative-adversarial-network handwritten-text-recognition image-segmentation page-xml pix2pix pytorch
Last synced: 02 Apr 2025
https://github.com/phamquiluan/publaynet
ICDAR 2019: MaskRCNN on PubLayNet datasets. Paragraph detection, table detection, figure detection,...
document-layout-analysis figure-detection mask-rcnn object-detection paragraph-detection pretrained-models publaynet pytorch table-detection
Last synced: 17 Aug 2025
https://github.com/phamquiluan/PubLayNet
ICDAR 2019: MaskRCNN on PubLayNet datasets. Paragraph detection, table detection, figure detection,...
document-layout-analysis figure-detection mask-rcnn object-detection paragraph-detection pretrained-models publaynet pytorch table-detection
Last synced: 20 Jul 2025
https://github.com/jpleorx/detectron2-publaynet
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
artificial-intelligence computer-vision deep-learning detectron2 document-analysis document-classification document-layout document-layout-analysis faster-rcnn instance-segmentation layout-analysis machine-learning neural-network neural-networks object-detection publaynet python python3 pytorch
Last synced: 10 May 2025
https://github.com/bobld/pdfpigmlnetblockclassifier
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
classifier csharp document-layout document-layout-analysis layout-analysis lightgbm machine-learning ml-net pdf pdf-document pdf-document-processor pdfpig publaynet
Last synced: 14 Apr 2025
https://github.com/bobld/simple-docstrum
A step-by-step C# implementation of the Docstrum algorithm
csharp docstrum document-layout-analysis dotnet pdf pdfpig
Last synced: 23 Jun 2025
https://github.com/hpanwar08/document-layout-analysis-app
Simple docker deployment of document layout analysis using detectron2
docker document-layout-analysis python reactjs
Last synced: 10 May 2025
https://github.com/bobld/publaynet-maskrcnn-mlnet
Using a MaskRCNN model trained on the PubLayNet dataset with ML.NET in C# / .NET for document layout analysis and page segmentation tasks.
csharp document-layout-analysis dotnet figure-detection mask-detection mask-rcnn mlnet ocr onnx page-segmentation paragraph-detection pretrained-models publaynet table-detection
Last synced: 31 Jul 2025
https://github.com/stuartemiddleton/glosat_table_dataset
GloSAT Historical Measurement Table Dataset
artificial-intelligence dataset document-layout-analysis machine-learning table-detection table-structure-recognition
Last synced: 11 Apr 2025
https://github.com/bobld/pdfpigsvmregionclassifier
Proof of concept of a simple SVM Region Classifier using PdfPig and Accord.NET. The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
accord-net csharp document-layout-analysis machine-learning pdf pdf-document pdfpig publaynet support-vector-machine svm svm-classifier svm-training
Last synced: 14 Apr 2025
https://github.com/huythai855/quizvista
A system for generating multiple-choice exams using artificial intelligence - QuizVista
document-layout-analysis function-calling quiz-generator retrieval-augmented-generation
Last synced: 27 Aug 2025
https://github.com/mansurpro/docuparse
DocuParse is a high-performance tool for converting PDF documents into clean, structured Markdown files. Designed for speed and accuracy, it extracts and formats content while minimizing errors like hallucinations and repetitions.
digital-archive document-layout-analysis google-colab huggingface-transformers markdown-conversion pdf-parsing pdf-to-markdown tesseract-ocr text-extraction
Last synced: 19 Jan 2026