An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with document-conversion

A curated list of projects in awesome lists tagged with document-conversion.

https://github.com/c4illin/convertx

💾 Self-hosted online file converter. Supports 1000+ formats ⚙️

bun conversion convert converter document-conversion elysia file-conversion file-converter hacktoberfest pdf-converter self-hosted tailwindcss typescript

Last synced: 17 Jan 2026

https://github.com/zelon88/hrconvert2

A self-hosted, drag-and-drop & nosql file conversion server & share tool that supports 445 file formats in 13 languages.

archiver conversion converter document-conversion extractor file-converter file-sharing format image multilingual ocr ocr-recognition pdf-converter php server virustotal

Last synced: 16 May 2025

https://github.com/blaspsoft/doxswap

📄 🔄 Doxswap is a Laravel package for seamless document conversion using LibreOffice. Effortlessly convert DOCX, PDF, ODT, and more with a simple, elegant API. Supports Laravel storage, configurable settings, and secure file handling.

converter document-conversion document-converter docx file-conversion laravel libreoffice odt pdf php xlsx

Last synced: 22 Apr 2025

https://github.com/mathpix/mpx-cli

CLI for document conversion for scientific documents, powered by Mathpix OCR

cli converter document-conversion format-converter html latex markdown pdf science static-site-generator

Last synced: 28 Jun 2025

https://github.com/yaph/chatgpt-export

A browser bookmarklet for exporting conversations with ChatGPT as markdown files.

bookmarklet browser-bookmarklets browser-tools chatgpt chatgpt-tools document-conversion html2markdown markdown vanilla-javascript vanilla-js

Last synced: 09 Apr 2025

https://github.com/iamarunbrahma/pdf-to-markdown

Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.

document-conversion document-processing information-retrieval pdf-converter pdf-extraction pdf-parsing pdf-to-markdown python rag retrieval-augmented-generation text-extraction

Last synced: 10 Apr 2025

https://github.com/OpenSextant/Xponents

Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.

document-conversion geocoding geonames geoparsing geotagging information-extraction nlp solr tika

Last synced: 06 Apr 2025

https://github.com/prexview/prexview-js

Transform your data from XML or JSON to high quality, beautiful and readable documents in PDF, HTML, PNG or JPG

convert converter document-conversion image json json-parser json-to-html json-to-pdf json-to-png prexview tranform xml xml-parser xml-to-html xml-to-pdf xml-to-png

Last synced: 10 Oct 2025

https://github.com/scivision/office-headless

Headless document conversion and printing using LibreOffice or Microsoft Office

asyncio document-conversion libreoffice-converter

Last synced: 14 Dec 2025

https://github.com/phlummox/pptx-to-md

Convert PowerPoint or LibreOffice Impress files to Beamer-friendly, Pandoc-style markdown

beamer document-conversion latex libreoffice markdown openoffice pdf powerpoint presentation-slides

Last synced: 24 Jun 2025

https://github.com/carboneio/carbone-sdk-go

Golang SDK to generate documents (PDF DOCX ODT XLSX ODS XML ...) with the Carbone Cloud API

document-conversion docx go golang libreoffice multilingual pdf pdf-generation report-generator sdk template-engine

Last synced: 14 Jan 2026

https://github.com/mondain-dev/convert-sep

Generates tufte-book style documents for Stanford Encyclopedia of Philosophy (SEP) entries.

document-conversion html latex tex xelatex

Last synced: 02 May 2025

https://github.com/ilyashusterman/doc-to-readable

Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.

docs document-conversion documents file-processing html javascript json markdown nodejs npm rag splitter

Last synced: 28 Jan 2026

https://github.com/prexview/prexview-php

Transform your data from XML or JSON to high quality, beautiful and readable documents in PDF, HTML, PNG or JPG.

convert converter document-conversion json json-parser json-to-html json-to-pdf json-to-png pdf prexview transform xml xml-parser xml-to-html xml-to-pdf xml-to-png

Last synced: 12 Feb 2026

https://github.com/sparkfish/glyphcast

Cairo-inspired dependency-free replacement for casting SVG to PNG or PDF format

cairo cairsvg converter document-conversion pdf png python resvg svg svglib

Last synced: 03 Oct 2025

https://github.com/reclamador/document_clipper

A set of utility classes and functions to process documents with Python

document-conversion document-management python python27

Last synced: 17 Aug 2025

https://github.com/trsdn/markitdown-mcp

📄 Professional MCP server for converting 29+ file formats to Markdown - Perfect for Claude Desktop and AI workflows!

ai-tools claude-desktop document-conversion file-conversion image-processing markdown markitdown mcp metadata-extraction model-context-protocol office-documents pdf-converter python speech-to-text

Last synced: 25 Sep 2025

https://github.com/prexview/prexview-studio

It gives you the power, flexibility, and speed you have always wanted to craft beautiful designs to generate high volume documents in PDF, HTML, PNG and JPG

convertion document-conversion json json-to-html json-to-pdf json-to-png transform xml xml-parsing xml-to-html xml-to-pdf xml-to-png

Last synced: 12 Apr 2025

https://github.com/richie-rich90454/training-generator

Training Generator is a cross-platform desktop app built with Electron and Node.js that converts documents (PDF, DOCX, DOC, RTF, TXT, MD, HTML) into structured AI training data. Using local Ollama models, it extracts instructions, Q&A pairs, and conversation data for machine learning, AI fine-tuning, and NLP workflows, while keeping all processing.

ai ai-data-analysis ai-training-data cpp desktop-app document-conversion electron html-css-javascript jsonl local-ai ml ollama ollama-api training-materials

Last synced: 26 Feb 2026

https://github.com/prexview/prexview-ruby

Transform your data from XML or JSON to high quality, beautiful and readable documents in PDF, HTML, PNG or JPG

converter document-conversion image json json-parser json-to-html json-to-pdf json-to-png pdf transform xml xml-parser xml-to-html xml-to-pdf xml-to-png

Last synced: 31 Jul 2025

https://github.com/prexview/prexview-shift

With PrexView Shift, convert files without integrating them into or modifying your applications. It reads files from a folder and converts all of them into the selected format.

apis convert document-conversion json json-to-html json-to-pdf json-to-png xml xml-to-html xml-to-pdf xml-to-png

Last synced: 12 Apr 2025

https://github.com/jai2dev/convert-to-pdf

Convert your documents to PDF format and extract information from them. Supports many extensions such as docs, docx, rtf, etc.

api-rest converter document-conversion flask pdf python3

Last synced: 30 Mar 2025

https://github.com/mohammedsafvan/docling-rs

Rust SDK for Docling Serve that makes document conversion simple, reliable, and production-ready in Rust

api-client document-conversion rust

Last synced: 18 Feb 2026

https://github.com/devexpress-examples/office-file-api-dockerize-application

Create and dockerize an ASP.NET Core Web API application that uses the Office File API library to convert Excel and Word documents to HTML.

aspnetcore docker document-conversion dotnet excel net8 office-file-api spreadsheet word

Last synced: 23 Jul 2025

https://github.com/madstone-tech/veve-cli

Fast, themeable markdown-to-PDF converter built with Go

cli command-line-tool converter document-conversion golang markdown pandoc pdf pdf-generation

Last synced: 20 Feb 2026

https://github.com/furqanhun/textnomnom-py

Extract text from PDFs, PPTs, & URLs (with OCR support). Converts PPT to PDF & handles files or folders. 🦍

automated-conversion automation cross-platform document-conversion image-text-extraction linux pdf-processing pdf-to-text ppt ppt-to-text pptx pptx-to-text text-extraction windows

Last synced: 23 Mar 2025

https://github.com/hreikin/pdf-toolbox

Extract content from PDFs and convert or create new documents from the content in multiple output formats.

adobe document-conversion document-converter document-creation document-creator document-extraction image-extraction pandoc pymupdf pypandoc python python3 scrapy text-extraction

Last synced: 09 Jul 2025

https://github.com/keenlychuang/mdx-to-docs

Lightweight Python script to convert a directory of MDX files to PDF or DOCX

content-processing document-conversion docx file-conversion markdown mdx pdf-generation python-scripts

Last synced: 05 Oct 2025

https://github.com/noflo/noflo-tika

Document extraction components for NoFlo

document-conversion noflo tika

Last synced: 25 Feb 2025

https://github.com/nickgeek/abstract

A small CLI tool to turn µPad-flavoured markdown into pdf documents

document document-conversion markdown notes

Last synced: 10 Oct 2025

https://github.com/tristan-mcinnis/notion-to-word

Convert Notion pages to beautifully formatted Word documents with custom styling and templates

conversion document-conversion docx markdown notion notion-api notion-automation notion-integration notion2word python word

Last synced: 25 Sep 2025

https://github.com/huser123/docx_hirdetes_feldolgozo_wp_betheme-hez

Processing of advertisements sent to us in DOCX format for the WordPress Betheme theme.

automation betheme cms content-processing document-conversion docx open-source python web-development wordpress

Last synced: 28 Jun 2025

https://github.com/zangadze1101/convert

📝 Convert Markdown to HTML automatically with GitHub Pages deployment, making content management simple and efficient for your projects.

clash containers convert converter docker-compose document-conversion elysia esp8266 iot measure python quantumult quantumultx self-hosted smarthome tuya-smart units v2ray

Last synced: 04 Nov 2025

https://github.com/hlexnc/document-conversion-solutions

Document Conversion Solutions 📄🔄 - A comprehensive suite of tools for document conversion, including Python APIs, JavaScript solutions, and an npm package. Dedicated to simplifying the export of DOCX, PPTX, and XLSX documents.

api code-snippets document-conversion docx fastapi html javascript markdown nextjs npm-package officegen open-source pptx python typescript utility

Last synced: 30 Dec 2025

https://github.com/ovler-young/sciencedirect2markdown

A Python tool to convert ScienceDirect JSON content to Markdown format, supporting text styling, math formulas, tables, and figures.

converter document-conversion markdown sciencedirect scientific-papers streamlit text-processing

Last synced: 08 Apr 2025