An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with docx-parser

A curated list of projects in awesome lists tagged with docx-parser.

https://github.com/ispras/dedoc

Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser

doc document-analysis document-content-extraction documents docx docx-parser excel html html-parser logical-structure-extraction ocr odt pdf pdf-parser scanned-documents table-of-contents table-recognition txt

Last synced: 15 May 2025

https://github.com/has-abi/docparser

Extract text from your DOCX documents.

doc-parser document-parser docx-parser text-parser

Last synced: 25 Nov 2024

https://github.com/omar2535/biolife-au-01-attendance-parser

Biolife-AU-01 punch clock parsing program

docx docx-parser html-parser parser

Last synced: 18 Mar 2025

https://github.com/lukethacoder/docx-to-html

📃 A GUI-based DOCX-to-HTML parser. Useful for stripping inline styles out of DOCX files.

docx docx-parser rich-text

Last synced: 08 Apr 2025

https://github.com/fayazk/document-metadata-extractor

A Python tool that uses Google's Gemini AI to automatically extract structured metadata from PDF and DOCX documents, saving results to Excel for easy analysis and organizing raw responses as JSON files.

content-indexing data-extraction document-management document-processing docx-parser excel-export gemini-ai-project generative-ai json-output metadata-extraction nlp pdf-parser python-automation text-analysis

Last synced: 01 Apr 2025