Projects in Awesome Lists tagged with docx-parser
A curated list of projects in awesome lists tagged with docx-parser.
https://github.com/ispras/dedoc
Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
doc document-analysis document-content-extraction documents docx docx-parser excel html html-parser logical-structure-extraction ocr odt pdf pdf-parser scanned-documents table-of-contents table-recognition txt
Last synced: 15 May 2025
https://github.com/has-abi/docparser
Extract text from your DOCX documents.
doc-parser document-parser docx-parser text-parser
Last synced: 25 Nov 2024
https://github.com/omar2535/biolife-au-01-attendance-parser
Parser for the Biolife-AU-01 attendance time clock
docx docx-parser html-parser parser
Last synced: 18 Mar 2025
https://github.com/lukethacoder/docx-to-html
📃 A GUI-based DOCX-to-HTML parser. Useful for ripping inline styles out of DOCX files.
Last synced: 08 Apr 2025
https://github.com/fayazk/document-metadata-extractor
A Python tool that uses Google's Gemini AI to automatically extract structured metadata from PDF and DOCX documents, saving results to Excel for easy analysis and organizing raw responses as JSON files.
content-indexing data-extraction document-management document-processing docx-parser excel-export gemini-ai-project generative-ai json-output metadata-extraction nlp pdf-parser python-automation text-analysis
Last synced: 01 Apr 2025