Projects in Awesome Lists tagged with document-conversion
A curated list of projects in awesome lists tagged with document-conversion.
https://github.com/documize/community
Modern Confluence alternative designed for internal & external docs, built with Go + EmberJS
collaboration confluence dashboards docs document-conversion document-management documentation documentation-tool emberjs enterprise go keycloak knowledge knowledge-base knowledge-management knowledgebase reporting selfhosted wiki
Last synced: 13 May 2025
https://github.com/c4illin/convertx
💾 Self-hosted online file converter. Supports 1000+ formats ⚙️
bun conversion convert converter document-conversion elysia file-conversion file-converter hacktoberfest pdf-converter self-hosted tailwindcss typescript
Last synced: 17 Jan 2026
https://github.com/carboneio/carbone
Fast and simple report generator, from JSON to pdf, xlsx, docx, odt...
carbone document-conversion javascript libreoffice microsoft-office multilingual nodejs pdf-generation report-generator template-engine
Last synced: 17 Jan 2026
https://github.com/zelon88/hrconvert2
A self-hosted, drag-and-drop & nosql file conversion server & share tool that supports 445 file formats in 13 languages.
archiver conversion converter document-conversion extractor file-converter file-sharing format image multilingual ocr ocr-recognition pdf-converter php server virustotal
Last synced: 16 May 2025
https://github.com/zelon88/HRConvert2
A self-hosted, drag-and-drop & nosql file conversion server & share tool that supports 445 file formats in 13 languages.
archiver conversion converter document-conversion extractor file-converter file-sharing format image multilingual ocr ocr-recognition pdf-converter php server virustotal
Last synced: 28 Mar 2025
https://github.com/blaspsoft/doxswap
📄 🔄 Doxswap is a Laravel package for seamless document conversion using LibreOffice. Effortlessly convert DOCX, PDF, ODT, and more with a simple, elegant API. Supports Laravel storage, configurable settings, and secure file handling.
converter document-conversion document-converter docx file-conversion laravel libreoffice odt pdf php xlsx
Last synced: 22 Apr 2025
https://github.com/mathpix/mpx-cli
CLI for document conversion for scientific documents, powered by Mathpix OCR
cli converter document-conversion format-converter html latex markdown pdf science static-site-generator
Last synced: 28 Jun 2025
https://github.com/Mathpix/mpx-cli
CLI for document conversion for scientific documents, powered by Mathpix OCR
cli converter document-conversion format-converter html latex markdown pdf science static-site-generator
Last synced: 09 Jul 2025
https://github.com/yaph/chatgpt-export
A browser bookmarklet for exporting conversations with ChatGPT as markdown files.
bookmarklet browser-bookmarklets browser-tools chatgpt chatgpt-tools document-conversion html2markdown markdown vanilla-javascript vanilla-js
Last synced: 09 Apr 2025
https://github.com/iamarunbrahma/pdf-to-markdown
Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.
document-conversion document-processing information-retrieval pdf-converter pdf-extraction pdf-parsing pdf-to-markdown python rag retrieval-augmented-generation text-extraction
Last synced: 10 Apr 2025
https://github.com/OpenSextant/Xponents
Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.
document-conversion geocoding geonames geoparsing geotagging information-extraction nlp solr tika
Last synced: 06 Apr 2025
https://github.com/prexview/prexview-js
Transform your data from XML or JSON to high quality, beautiful and readable documents in PDF, HTML, PNG or JPG
convert converter document-conversion image json json-parser json-to-html json-to-pdf json-to-png prexview tranform xml xml-parser xml-to-html xml-to-pdf xml-to-png
Last synced: 10 Oct 2025
https://github.com/scivision/office-headless
Headless document conversion and printing using LibreOffice or Microsoft Office
asyncio document-conversion libreoffice-converter
Last synced: 14 Dec 2025
https://github.com/phlummox/pptx-to-md
Convert PowerPoint or LibreOffice Impress files to Beamer-friendly, Pandoc-style markdown
beamer document-conversion latex libreoffice markdown openoffice pdf powerpoint presentation-slides
Last synced: 24 Jun 2025
https://github.com/carboneio/carbone-sdk-go
Golang SDK to generate documents (PDF DOCX ODT XLSX ODS XML ...) with the Carbone Cloud API
document-conversion docx go golang libreoffice multilingual pdf pdf-generation report-generator sdk template-engine
Last synced: 14 Jan 2026
https://github.com/mondain-dev/convert-sep
Generates tufte-book style documents for Stanford Encyclopedia of Philosophy (SEP) entries.
document-conversion html latex tex xelatex
Last synced: 02 May 2025
https://github.com/keithweaver/ibm-watson-php-sdk
PHP SDK for IBM Watson. Makes it easier to build Watson-powered apps in PHP.
classifier document-conversion ibm ibm-bluemix ibm-watson ibm-watson-api ibm-watson-services machine-learning natural-language natural-language-processing retrieve-and-rank
Last synced: 16 Jan 2026
https://github.com/ilyashusterman/doc-to-readable
Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.
docs document-conversion documents file-processing html javascript json markdown nodejs npm rag splitter
Last synced: 28 Jan 2026
https://github.com/madsjulia/documentfunction.jl
Document Julia Functions (methods, arguments, keywords)
document-conversion documentation documentation-generator documentation-tool julia julia-language julialang mads
Last synced: 10 Apr 2025
https://github.com/carboneio/carbone-sdk-node
The Node SDK for using the Carbone render API easily.
document-conversion libreoffice microsoft-office nodejs pdf-generation report-generator template-engine
Last synced: 17 Jan 2026
https://github.com/prexview/prexview-php
Transform your data from XML or JSON to high quality, beautiful and readable documents in PDF, HTML, PNG or JPG.
convert converter document-conversion json json-parser json-to-html json-to-pdf json-to-png pdf prexview transform xml xml-parser xml-to-html xml-to-pdf xml-to-png
Last synced: 12 Feb 2026
https://github.com/reclamador/document_clipper
A set of utility classes and functions to process documents with Python
document-conversion document-management python python27
Last synced: 17 Aug 2025
https://github.com/trsdn/markitdown-mcp
📄 Professional MCP server for converting 29+ file formats to Markdown - Perfect for Claude Desktop and AI workflows!
ai-tools claude-desktop document-conversion file-conversion image-processing markdown markitdown mcp metadata-extraction model-context-protocol office-documents pdf-converter python speech-to-text
Last synced: 25 Sep 2025
https://github.com/prexview/prexview-studio
It gives you the power, flexibility, and speed you have always wanted to craft beautiful designs to generate high volume documents in PDF, HTML, PNG and JPG
convertion document-conversion json json-to-html json-to-pdf json-to-png transform xml xml-parsing xml-to-html xml-to-pdf xml-to-png
Last synced: 12 Apr 2025
https://github.com/richie-rich90454/training-generator
Training Generator is a cross-platform desktop app built with Electron and Node.js that converts documents (PDF, DOCX, DOC, RTF, TXT, MD, HTML) into structured AI training data. Using local Ollama models, it extracts instructions, Q&A pairs, and conversation data for machine learning, AI fine-tuning, and NLP workflows, while keeping all processing.
ai ai-data-analysis ai-training-data cpp desktop-app document-conversion electron html-css-javascript jsonl local-ai ml ollama ollama-api training-materials
Last synced: 26 Feb 2026
https://github.com/prexview/prexview-ruby
Transform your data from XML or JSON to high quality, beautiful and readable documents in PDF, HTML, PNG or JPG
converter document-conversion image json json-parser json-to-html json-to-pdf json-to-png pdf transform xml xml-parser xml-to-html xml-to-pdf xml-to-png
Last synced: 31 Jul 2025
https://github.com/prexview/prexview-shift
With PrexView Shift, convert files without integrating them or modifying your applications. It reads files from a folder and converts all of them into the selected format.
apis convert document-conversion json json-to-html json-to-pdf json-to-png xml xml-to-html xml-to-pdf xml-to-png
Last synced: 12 Apr 2025
https://github.com/katef/minidocbook
A subset of docbook, and rendering to PDF and HTML
docbook docs document-conversion documentation documentation-tool layout pdf-generation tex typography
Last synced: 03 Jan 2026
https://github.com/jai2dev/convert-to-pdf
Convert your documents to PDF format and extract information from them. Supports many extensions such as docs, docx, rtf, etc.
api-rest converter document-conversion flask pdf python3
Last synced: 30 Mar 2025
https://github.com/mohammedsafvan/docling-rs
Rust SDK for Docling Serve that makes document conversion simple, reliable, and production-ready in Rust
api-client document-conversion rust
Last synced: 18 Feb 2026
https://github.com/devexpress-examples/office-file-api-dockerize-application
Create and dockerize an ASP.NET Core Web API application that uses the Office File API library to convert Excel and Word documents to HTML.
aspnetcore docker document-conversion dotnet excel net8 office-file-api spreadsheet word
Last synced: 23 Jul 2025
https://github.com/C4illin/ConvertX
💾 Self-hosted online file converter. Supports 700+ formats
bun conversion convert converter document-conversion elysia file-conversion file-converter pdf-converter picocss self-hosted typescript
Last synced: 06 Aug 2025
https://github.com/madstone-tech/veve-cli
Fast, themeable markdown-to-PDF converter built with Go
cli command-line-tool converter document-conversion golang markdown pandoc pdf pdf-generation
Last synced: 20 Feb 2026
https://github.com/furqanhun/textnomnom-py
Extract text from PDFs, PPTs, & URLs (with OCR support). Converts PPT to PDF & handles files or folders. 🦍
automated-conversion automation cross-platform document-conversion image-text-extraction linux pdf-processing pdf-to-text ppt ppt-to-text pptx pptx-to-text text-extraction windows
Last synced: 23 Mar 2025
https://github.com/alexcoder04/docconvert
More than a document converter
bootstrap bootstrap5 convert document document-conversion document-converter documents docx gin go go-gin golang gui markdown odt pandoc random random-number-generator random-numbers server
Last synced: 03 Sep 2025
https://github.com/hreikin/pdf-toolbox
Extract content from PDFs and convert or create new documents from the content in multiple output formats.
adobe document-conversion document-converter document-creation document-creator document-extraction image-extraction pandoc pymupdf pypandoc python python3 scrapy text-extraction
Last synced: 09 Jul 2025
https://github.com/keenlychuang/mdx-to-docs
Lightweight Python script to convert a directory of MDX files to PDF or DOCX
content-processing document-conversion docx file-conversion markdown mdx pdf-generation python-scripts
Last synced: 05 Oct 2025
https://github.com/noflo/noflo-tika
Document extraction components for NoFlo
document-conversion noflo tika
Last synced: 25 Feb 2025
https://github.com/nickgeek/abstract
A small CLI tool to turn µPad-flavoured markdown into pdf documents
document document-conversion markdown notes
Last synced: 10 Oct 2025
https://github.com/tristan-mcinnis/notion-to-word
Convert Notion pages to beautifully formatted Word documents with custom styling and templates
conversion document-conversion docx markdown notion notion-api notion-automation notion-integration notion2word python word
Last synced: 25 Sep 2025
https://github.com/huser123/docx_hirdetes_feldolgozo_wp_betheme-hez
Processes advertisements sent to us in docx format for the WordPress Betheme theme.
automation betheme cms content-processing document-conversion docx open-source python web-development wordpress
Last synced: 28 Jun 2025
https://github.com/zangadze1101/convert
📝 Convert Markdown to HTML automatically with GitHub Pages deployment, making content management simple and efficient for your projects.
clash containers convert converter docker-compose document-conversion elysia esp8266 iot measure python quantumult quantumultx self-hosted smarthome tuya-smart units v2ray
Last synced: 04 Nov 2025
https://github.com/hlexnc/document-conversion-solutions
Document Conversion Solutions 📄🔄 - A comprehensive suite of tools for document conversion, including Python APIs, JavaScript solutions, and an npm package. Dedicated to simplifying the export of DOCX, PPTX, and XLSX documents.
api code-snippets document-conversion docx fastapi html javascript markdown nextjs npm-package officegen open-source pptx python typescript utility
Last synced: 30 Dec 2025
https://github.com/ovler-young/sciencedirect2markdown
A Python tool to convert ScienceDirect JSON content to Markdown format, supporting text styling, math formulas, tables, and figures.
converter document-conversion markdown sciencedirect scientific-papers streamlit text-processing
Last synced: 08 Apr 2025