An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with pdf-converter

A curated list of projects in awesome lists tagged with pdf-converter.

https://github.com/stirling-tools/stirling-pdf

#1 Locally hosted web application that allows you to perform various operations on PDF files

docker java pdf pdf-converter pdf-editor pdf-manipulation pdf-merger pdf-ocr pdf-tools pdf-web-apps pdfmerger

Last synced: 12 May 2025

https://github.com/opendatalab/mineru

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.

ai4science document-analysis extract-data layout-analysis ocr parser pdf pdf-converter pdf-extractor-llm pdf-extractor-pretrain pdf-extractor-rag pdf-parser python

Last synced: 12 May 2025

https://github.com/opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.

ai4science document-analysis extract-data layout-analysis ocr parser pdf pdf-converter pdf-extractor-llm pdf-extractor-pretrain pdf-extractor-rag pdf-parser python

Last synced: 24 Mar 2025

https://github.com/Stirling-Tools/Stirling-PDF

Locally hosted web application that allows you to perform various operations on PDF files

docker java pdf pdf-converter pdf-manipulation pdf-merger pdf-ocr pdf-tools pdf-web-apps pdfmerger

Last synced: 24 Mar 2025

https://github.com/wmjordan/pdfpatcher

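PDFPatcher (PDF补丁丁): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more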

pdf pdf-converter pdf-document-processor pdf-generation

Last synced: 12 May 2025

https://github.com/wmjordan/PDFPatcher

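PDFPatcher (PDF补丁丁): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more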

pdf pdf-converter pdf-document-processor pdf-generation

Last synced: 24 Mar 2025

https://github.com/gotenberg/gotenberg

A developer-friendly API for converting numerous document formats into PDF files, and more!

api chrome chromium convert-to-pdf docker docx-to-pdf excel exiftool html-to-pdf libreoffice openoffice pdf pdf-converter pdftk puppeteer qpdf screenshots unoconv wkhtmltopdf word

Last synced: 12 May 2025

https://github.com/thecodingmachine/gotenberg

A developer-friendly API for converting numerous document formats into PDF files, and more!

api chrome chromium convert-to-pdf docker docx-to-pdf excel exiftool html-to-pdf libreoffice openoffice pdf pdf-converter pdftk puppeteer qpdf screenshots unoconv wkhtmltopdf word

Last synced: 26 Apr 2025

https://github.com/marcbachmann/node-html-pdf

This repo isn't maintained anymore, as PhantomJS was deprecated a long time ago. Please migrate to headless Chrome/Puppeteer.

html pdf pdf-converter phantomjs

Last synced: 13 May 2025

https://github.com/jorisschellekens/borb

borb is a library for reading, creating and manipulating PDF files in Python.

library pdf pdf-conversion pdf-converter pdf-generation pdf-library python python3 sdk typesetting

Last synced: 14 May 2025

https://github.com/artifexsoftware/pdf2docx

Open source Python library for converting PDF to DOCX.

docx extract-table pdf-converter pdf-to-word pymupdf

Last synced: 14 May 2025

https://github.com/ArtifexSoftware/pdf2docx

Open source Python library for converting PDF to DOCX.

docx extract-table pdf-converter pdf-to-word pymupdf

Last synced: 28 Mar 2025

https://github.com/dothinking/pdf2docx

Open source Python library for converting PDF to DOCX.

docx extract-table pdf-converter pdf-to-word pymupdf

Last synced: 21 Dec 2024

https://github.com/arachnys/athenapdf

Drop-in replacement for wkhtmltopdf built on Go, Electron and Docker

aws-ecs cli docker electron go golang html-to-pdf javascript kubernetes microservice pdf-conversion pdf-converter report

Last synced: 18 Jan 2025

https://github.com/modesty/pdf2json

converts binary PDF to JSON and text, for server-side PDF processing and command-line use.

json pdf pdf-converter pdf-form pdf-text pdf2form pdf2json pdf2text

Last synced: 12 May 2025

https://github.com/sajari/docconv

Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

conversion docs docx go html pdf pdf-converter rtf rtf-files word xml

Last synced: 15 May 2025

https://github.com/jzillmann/pdf-to-markdown

A PDF to Markdown converter

markdown pdf-converter

Last synced: 08 Apr 2025

https://github.com/zelon88/hrconvert2

A self-hosted, drag-and-drop & nosql file conversion server & share tool that supports 445 file formats in 13 languages.

archiver conversion converter document-conversion extractor file-converter file-sharing format image multilingual ocr ocr-recognition pdf-converter php server virustotal

Last synced: 16 May 2025

https://github.com/zelon88/HRConvert2

A self-hosted, drag-and-drop & nosql file conversion server & share tool that supports 445 file formats in 13 languages.

archiver conversion converter document-conversion extractor file-converter file-sharing format image multilingual ocr ocr-recognition pdf-converter php server virustotal

Last synced: 28 Mar 2025

https://github.com/rdvojmoc/DinkToPdf

C# .NET Core wrapper for wkhtmltopdf library that uses Webkit engine to convert HTML pages to PDF.

html net-core net-standard pdf pdf-converter wkhtmltopdf

Last synced: 16 Mar 2025

https://github.com/spatie/pdf-to-text

Extract text from a pdf

pdf pdf-converter php text

Last synced: 13 May 2025

https://github.com/booktype/Booktype

Booktype is a free, open source platform that produces beautiful, engaging books formatted for print, Amazon, iBooks and almost any ereader within minutes.

books booktype django ebooks editing epub ibooks javascript pdf pdf-converter python restful

Last synced: 22 Mar 2025

https://github.com/adithya-s-k/marker-api

Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.

api fastapi marker pdf-converter pdf-files pdf-parser pdf-parsing rest-api

Last synced: 16 May 2025

https://github.com/elliotblackburn/mdpdf

Markdown to PDF command line app with support for stylesheets

converter javascript markdown node pdf pdf-converter

Last synced: 06 Mar 2025

https://github.com/vladholubiev/serverless-libreoffice

Run LibreOffice in AWS Lambda to create PDFs & convert documents

aws-lambda javascript libreoffice nodejs pdf pdf-conversion pdf-converter serverless terraform

Last synced: 05 Apr 2025

https://github.com/drmingler/docling-api

Easily deployable and scalable backend server that efficiently converts various document formats (PDF, DOCX, PPTX, HTML, images, etc.) into Markdown. With support for both CPU and GPU processing, it is ideal for large-scale workflows and offers text/table extraction, OCR, and batch processing with sync/async endpoints.

api fastapi markdown-parser pdf-chatbot pdf-conversion pdf-converter pdf-parser pdf-parsing pdf-to-markdown

Last synced: 16 May 2025

https://github.com/bitcrowd/chromic_pdf

Convenient HTML to PDF/A rendering library for Elixir based on Chrome & Ghostscript

chrome-devtools chrome-headless invoice pdf pdf-converter pdf-generation

Last synced: 30 Mar 2025

https://github.com/lucasrla/remarks

Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable tablets. Export to Markdown, PDF, PNG, SVG

annotations epub highlighting markdown obsidian ocr ocrmypdf pdf pdf-converter pymupdf remarkable-tablet roamresearch svg-images zotero

Last synced: 05 Apr 2025

https://github.com/abarker/pdfCropMargins

pdfCropMargins -- a program to crop the margins of PDF files

crop cropper pdf pdf-converter pdf-document-processor python

Last synced: 26 Mar 2025

https://github.com/hellerbarde/stapler

A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk

pdf pdf-converter pdf-document-processor python

Last synced: 06 Apr 2025

https://github.com/nickrussler/email-to-pdf-converter

Converts email files (eml, msg) to pdf

email eml java msg outlook pdf pdf-converter

Last synced: 06 May 2025

https://github.com/adrg/go-wkhtmltopdf

Handcrafted Go bindings for wkhtmltopdf and high-level HTML to PDF conversion interface

bindings converter go golang golang-library golang-package html html-to-pdf library native pdf pdf-conversion pdf-converter wkhtmltopdf wkhtmltox

Last synced: 16 May 2025

https://github.com/hajareshyam/pdf-creator-node

This package generates PDFs from HTML in Node.js

html-pdf html-template node pdf pdf-converter pdf-creator pdf-generation

Last synced: 08 Apr 2025

https://github.com/shelfio/aws-lambda-libreoffice

Utility to work with Docker version of LibreOffice in Lambda

aws-lambda libreoffice node-module nodejs npm-package pdf-converter pdf-generation serverless

Last synced: 13 Mar 2025

https://github.com/opengovsg/pdf2md

A PDF to Markdown converter

markdown pdf-converter

Last synced: 02 Dec 2024

https://github.com/jmrozanec/pdf-converter

A Java library to convert .pdf files into .epub, .txt, .png, .jpg, .zip formats.

epub hacktoberfest jpg pdf pdf-converter png zip

Last synced: 03 Apr 2025

https://github.com/ds4sd/deepsearch-toolkit

Interact with the Deep Search platform for new knowledge explorations and discoveries

accelerated-discovery deepsearch knowledge-extraction knowledge-graph nlp pdf-converter python rag semantic-retrieval

Last synced: 15 May 2025

https://github.com/andytango/mupdf-js

📰 Yet another Webassembly PDF renderer for node and the browser

mupdf pdf pdf-converter pdf-viewer wasm webassembly

Last synced: 05 Apr 2025

https://github.com/spiritix/php-chrome-html2pdf

A PHP library for converting HTML to PDF using Google Chrome

html2pdf htmltopdf pdf pdf-converter php puppeteer puppeteer-pdf

Last synced: 15 May 2025

https://github.com/sunshineplan/imgconv

Golang image format convert, resize and watermark.

format-converter golang image-processing pdf-converter resize watermark

Last synced: 04 Apr 2025

https://github.com/yumauri/gotenberg-js-client

A simple JS/TS client for interacting with a Gotenberg API

gotenberg pdf pdf-converter

Last synced: 06 Apr 2025

https://github.com/nahidulhasan/laravel-pdf

A simple package for easily generating PDF documents from HTML. It is designed for Laravel, but you can also use it without Laravel.

easily html2pdf laravel laravel-5-package laravel-html2pdf laravel-package laravel-pdf laravel5 package pdf pdf-converter pdf-generation wkhtmltopdf

Last synced: 10 Apr 2025

https://github.com/gnd/archive_downloader

A node.js book downloader from Archive.org

archive-dot-org ebook ebook-process ocr pdf pdf-converter

Last synced: 22 Mar 2025

https://github.com/arminveres/md-pdf.nvim

Preview markdown files and convert to PDF inside Neovim!

lua markdown nvim nvim-plugin pdf pdf-converter previewer

Last synced: 01 Apr 2025

https://github.com/dmester/pdftosvg.net

Fully managed .NET library for converting PDF files to SVG.

converter dotnet pdf pdf-converter pdf2svg pdftosvg svg svg-converter

Last synced: 09 Apr 2025

https://github.com/betterwrite/pdfeasy

📕 A JavaScript Client/Server Side PDF-Generator based in PDFKit

pdf pdf-converter pdf-document pdf-generation pdfkit typescript

Last synced: 22 Nov 2024

https://github.com/eiaserinnys/pdf2md

This project, pdf2md, transforms academic paper PDF files into digestible text files. By analyzing the layout of the PDF file, the application restructures paragraphs and translates desired content. The final result is a conveniently exported text file.

pdf-converter translation

Last synced: 21 Apr 2025

https://github.com/ladykerr/pdf-to-excel

Convert PDF to Excel using this Python script 🐍

pdf-converter python-3

Last synced: 12 Apr 2025

https://github.com/iamarunbrahma/pdf-to-markdown

Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.

document-conversion document-processing information-retrieval pdf-converter pdf-extraction pdf-parsing pdf-to-markdown python rag retrieval-augmented-generation text-extraction

Last synced: 10 Apr 2025

https://github.com/matthsena/alchemark

Your files ready for Gen AI ✨🚀 AlcheMark is a lightweight PDF to Markdown, alchemical-inspired toolkit that transmutes PDF documents into structured Markdown pages—complete with rich metadata and named‐entity annotations—empowering you to uncover insights page by page.

markdown pdf-converter pdf2md

Last synced: 13 May 2025

https://github.com/fahdmirza/doclingwithollama

Docling with Ollama - RAG on Local Files with Local Models

docling ollama pdf-converter retrieval-augmented-generation

Last synced: 13 Apr 2025

https://github.com/calvintychan/serverless-html-pdf

Convert HTML to PDF through a Lambda function using PhantomJS.

pdf-converter phantomjs rasterize serverless

Last synced: 02 May 2025

https://github.com/pankajr141/pdf2jpg

Utility to convert PDF into JPG files

pdf-converter pdf-document-processor

Last synced: 05 Mar 2025

https://github.com/miikanissi/zebrafy

Python library for converting PDF and images to and from Zebra Programming Language (ZPL).

gfa pdf-converter python zebra-printer zebra-programming-language zpl zpl-programming-language

Last synced: 05 Apr 2025

https://github.com/vladocar/pdfsave

Convert websites into readable PDFs

cli node node-js nodejs pdf pdf-converter pdf-generation readability readable

Last synced: 20 Nov 2024

https://github.com/chen0040/cs-pdf-to-image

A simple library to convert PDF to image for .NET

csharp library pdf-converter ps

Last synced: 16 Dec 2024

https://github.com/ashutoshvarma/pyxpdf

Fast and memory-efficient Python PDF Parser based on xpdf sources

cython pdf pdf-converter pdf-parser pdfparser pdftohtml pdftopng pdftotext python xpdf xpdf-reader

Last synced: 12 Apr 2025

https://github.com/compdfkit/compdfkit-api-java

A Java component library for integrating with ComPDFKit API to build a PDF Viewer and Editor.

api compdfkit-api java pdf pdf-converter pdf-document pdf-editor pdf-viewer

Last synced: 30 Apr 2025

https://github.com/BetaHuhn/vercel-pdf-converter

📄▲ Vercel function which generates PDFs from Webpages.

pdf pdf-converter vercel vercel-pdf-converter vercel-serverless-functions

Last synced: 26 Mar 2025

https://github.com/betahuhn/vercel-pdf-converter

📄▲ Vercel function which generates PDFs from Webpages.

pdf pdf-converter vercel vercel-pdf-converter vercel-serverless-functions

Last synced: 20 Nov 2024

https://github.com/coenttb/swift-html-to-pdf

A Swift package providing an easy-to-use interface for concurrently printing HTML to PDF on iOS and macOS.

html pdf pdf-converter swift

Last synced: 28 Mar 2025

https://github.com/adamfoneil/questpdfutil

A lightweight HTML to PDF conversion library for QuestPDF

html pdf-converter

Last synced: 23 Nov 2024

https://github.com/austonianai/pdf-to-json

Convert your PDFs to custom JSON data using AI.

gpt-3 json json-converter nextjs openai pdf pdf-converter react typescript

Last synced: 02 Apr 2025

https://github.com/hoangtran0410/saoke_yagi

Account statements of the Vietnam Fatherland Front (MTTQ) on support for people affected by Typhoon Yagi

pdf-converter pdf-to-csv pdf-to-json

Last synced: 05 Dec 2024

https://github.com/mnuessler/docker-pdftk

Docker image for pdftk

docker-image pdf-converter pdftk

Last synced: 23 Mar 2025

https://github.com/zentered/rust-pdf-to-png-service

PDF to PNG conversion service in Rust

libvips pdf-converter pdf-to-png pdfium rust

Last synced: 05 Apr 2025

https://github.com/fluidsonic/fluid-pdf

Easy PDF generation with HTML & CSS using Chromium or Google Chrome

kotlin pdf pdf-converter pdf-generation

Last synced: 14 Apr 2025

https://github.com/compdfkit/compdfkit-api-php

A PHP component library for integrating with ComPDFKit API to build a PDF Viewer and Editor.

api compdfkit-api pdf pdf-converter pdf-document pdf-editor pdf-viewer php

Last synced: 30 Apr 2025

https://github.com/christiaanderidder/questpdf.markdown

QuestPDF.Markdown allows rendering markdown into a QuestPDF document

csharp markdig markdown pdf pdf-converter questpdf

Last synced: 03 May 2025

https://github.com/starkblaze01/pdf-to-image

Simple Python Command Line tool to convert PDF to Image.

hacktoberfest image pdf pdf-converter pdftoimage pil pip-package poppler

Last synced: 16 Mar 2025

https://github.com/spiritix/html-to-pdf

Convert HTML markup into beautiful PDF files using the famous wkhtmltopdf library.

html html-to-pdf htmltopdf pdf pdf-conversion pdf-converter php wkhtmltopdf

Last synced: 18 Mar 2025

https://github.com/compdfkit/compdfkit-api-python

A Python component library for integrating with ComPDFKit API to build a PDF Viewer and Editor.

api compdfkit-api pdf pdf-converter pdf-document pdf-editor pdf-viewer python

Last synced: 30 Apr 2025

https://github.com/vicajilau/drag-pdf

Application, developed with Flutter, for creating PDF documents by combining images, PDFs, and document scans.

pdf-converter pdf-document pdf-generation pdf-viewer

Last synced: 13 Feb 2025

https://github.com/foosel/cardfoldr

Convert a PDF of card grids as found in Print'n'play games into a PDF with the cards arranged side by side with their backs, with a foldline right down the middle of the page

browser pdf-converter pnp print-and-play tool

Last synced: 25 Mar 2025

https://github.com/compdfkit/compdfkit-api-.net

A .NET component library for integrating with ComPDFKit API to build a PDF Viewer and Editor.

api compdfkit-api csharp dotnet dotnet-core pdf pdf-converter pdf-document pdf-editor pdf-viewer

Last synced: 12 May 2025

https://github.com/adrienjoly/hsbcstatementparser

Transforms PDF bank statements from HSBC into a list of operations in JSON or TSV format.

bank-statement conversion csv-export json-export pdf-converter pdf-parser tsv-format

Last synced: 23 Apr 2025

https://github.com/flaribbit/paper-translator

Paper translator! Uses Baidu Translate to convert English papers in PDF format into HTML files with side-by-side Chinese and English text!

paper pdf pdf-converter translator

Last synced: 12 Apr 2025

https://github.com/compdfkit/compdfkit-api-samples

ComPDFKit PDF API is organized around the REST standard and supports various programming languages with rich PDF features, including conversion, document editor, data extraction, and so forth.

api curl dotnet java javascript pdf-converter pdf-editor pdf-viewer php python rest-api swift

Last synced: 09 Feb 2025

https://github.com/stevencohn/pdf2image

Export PDF pages to individual JPG files

csharp pdf-converter

Last synced: 13 Feb 2025