Projects in Awesome Lists tagged with pdf-files
A curated list of projects in awesome lists tagged with pdf-files.
https://github.com/pdfcpu/pdfcpu
A PDF processor written in Go.
go golang golang-library pdf pdf-files pdf-lib pdf-processor pdflib processor
Last synced: 09 Sep 2025
https://github.com/uglytoad/pdfpig
Read and extract text and other content from PDFs in C# (port of PDFBox)
alto-xml csharp document-analysis hocr layout-analysis netstandard page-xml pdf pdf-document pdf-document-processor pdf-extractor pdf-files pdf-generation pdfbox
Last synced: 10 May 2025
https://github.com/pdf-rs/pdf
Rust library to read, manipulate and write PDF files.
Last synced: 13 May 2025
https://github.com/swati4star/images-to-pdf
An app to convert images to PDF files!
convert-images hacktoberfest jpg jpg-images pdf pdf-converter pdf-files playstore text-to-pdf
Last synced: 15 May 2025
https://github.com/adithya-s-k/marker-api
Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.
api fastapi marker pdf-converter pdf-files pdf-parser pdf-parsing rest-api
Last synced: 16 May 2025
https://github.com/boazsegev/combine_pdf
A pure Ruby library to merge PDF files, number pages and maybe more...
pdf pdf-files pdf-generation pdf-merge ruby
Last synced: 13 May 2025
https://github.com/Krasjet/pdf.tocgen
A CLI toolset to generate table of contents for PDF files automatically.
cli pdf pdf-document pdf-files pymupdf scraping table-of-contents toc-generator
Last synced: 15 May 2025
https://github.com/unidoc/unidoc
This repository has moved! https://github.com/unidoc/unipdf
golang pdf pdf-files pdf-invoice pdf-library text-extraction unidoc
Last synced: 01 Apr 2025
https://github.com/chinapandaman/pypdfform
:fire: The Python library for PDF forms.
pdf pdf-document pdf-document-processor pdf-files pdf-forms pdf-generation pdf-merge pdf-merger pdffiller python python-3 python-library python-package python-programming python-project python3
Last synced: 14 May 2025
https://github.com/ropensci/pdftools
Text Extraction, Rendering and Converting of PDF Documents
pdf-files pdf-format pdftools poppler poppler-library r r-package rstats text-extraction
Last synced: 27 Aug 2025
https://github.com/chunyenhuang/hummusrecipe
A powerful PDF tool for NodeJS based on HummusJS.
nodejs overlay-pdf pdf pdf-files pdf-generation pdf-manipulation pdf-modification pdf-parsing
Last synced: 16 May 2025
https://github.com/podofo/podofo
A C++17 PDF manipulation library
cplusplus cpp pdf pdf-documents pdf-files pdf-generation
Last synced: 05 Apr 2025
https://github.com/michaelrsweet/htmldoc
HTML Conversion Software
encryption html html-doc html-files pdf pdf-files postscript
Last synced: 06 Apr 2025
https://github.com/benwiggy/PDFsuite
Python scripts, Automator Services and Quartz Filters for MacOS (OS X) that create, manipulate, and query PDF files
apple booklet macos pdf-files pdf-service python quartz-filters
Last synced: 27 Mar 2025
https://github.com/dealfonso/sapp
Simple and Agnostic PDF Document Parser in PHP - sign PDF docs using PHP
acrobat agnostic-pdf-parser digital-signature digital-signatures fpdi multiple-signatures pdf pdf-document pdf-files pdf-generation pdf-objects php sapp setapdf signatures tcpdf
Last synced: 07 Oct 2025
https://github.com/arminstraub/krop
A simple graphical tool to crop the pages of PDF files, written in Python/Qt
Last synced: 26 Mar 2025
https://github.com/mhucka/zowie
Adds Zotero "select" links to attachment files in a Zotero database on macOS, so that outside of Zotero, you can find the bibliographic entry to which a file belongs. (Only works for local storage, not linked attachments.)
extended-attributes file-metadata finder macos metadata pdf pdf-files zotero zotero-api zotero-link
Last synced: 20 Aug 2025
https://github.com/muriloventuroso/pdftricks
A simple, efficient application for small manipulations in PDF files using Ghostscript.
appcenter elementaryos ghostscript pdf pdf-files
Last synced: 30 Mar 2025
https://github.com/txn2/txpdf
HTML to PDF microservice
docker docker-image golang-application golang-server microservice pdf pdf-files pdf-generation webapi wkhtmltopdf
Last synced: 11 Apr 2025
https://github.com/r0wi-dev/workflow_ocr
This is a Nextcloud Workflow App that lets you process files via OCR on the server side.
nextcloud nextcloud-workflow-ocr ocr pdf-files
Last synced: 10 Apr 2025
https://github.com/naivehobo/pdfviewer
PDFViewer is a GUI tool, written using python3 and tkinter, which lets you view PDF documents.
pdf pdf-document pdf-document-processor pdf-files pdf-viewer tkinter tkinter-graphic-interface tkinter-gui tkinter-library tkinter-python
Last synced: 30 Apr 2025
https://github.com/MrBotDeveloper/PDF-Bot
A bot for doing many things with PDFs.
bot datastore docker gcp heroku heroku-deployment heroku-pdf-bot mrbotdeveloper mrdeveloper pdf-files pdf-manipulation pdf-manipulation-utitilies python python-telegram-bot telegram telegram-bot telegram-pdf-bot translation-files
Last synced: 22 Jul 2025
https://github.com/michelecotrufo/pdf2bib
A python library/command-line tool to quickly and automatically generate BibTeX data starting from the pdf file of a scientific publication.
arxiv bibtex bibtex-entry bibtex-parser doi extract pdf pdf-files python
Last synced: 29 Oct 2025
https://github.com/dansoftowner/pdfviewerfx
A PDF viewer library for your JavaFX application
java java-gui java-library java-pdf java8 javafx javafx-desktop-apps javafx-library javafx-pdf javafx-project javascript pdf pdf-displayer pdf-document pdf-files pdf-reader pdf-renderer pdf-viewer pdfjs pdfjs-dist
Last synced: 04 Oct 2025
https://github.com/siddhantsadangi/pdf-workdesk
A Streamlit-powered application that provides a user-friendly interface for editing PDF documents.
pdf pdf-document pdf-document-processor pdf-files pdf-viewer pdfkit python streamlit webapp
Last synced: 16 Mar 2025
https://github.com/hrbrmstr/pdfbox
📄◻️ Create, Manipulate and Extract Data from PDF Files (R Apache PDFBox wrapper)
pdf-document pdf-files pdfbox pdfbox-wrapper r r-cyber rstats
Last synced: 21 Mar 2025
https://github.com/gorgitko/molminer
Python library and command-line tool for extracting compounds from scientific literature.
chemical-entities chemspider extract-entities named-entity-recognition natural-language-processing ner ocsr pdf-files pubchem python
Last synced: 07 Apr 2025
https://github.com/codedotjs/interviewcake
:unicorn: Read Interview Cake questions and solutions.
algorithms challenge challenges competitive-programming datastructures interview interview-practice interviewcake-questions javascript pdf pdf-files questions solve
Last synced: 18 Aug 2025
https://github.com/mtgrosser/pdfunite
Merge PDF files with Ruby, no Java required
joining-pdf-files pdf pdf-files pdfunite ruby
Last synced: 13 Apr 2025
https://github.com/JavierCanon/PDF-OEditor.js
📑 HTML5 JavaScript online/offline browser editor 📜 for PDF files, for inserting signatures, images, photos, comments, or annotations
asp-net-core html5 insert-sign javascript js mozilla-pdf net-core pdf pdf-document pdf-files pdf-oeditor pdf-sign pdf-signature pdf-signer pdf-viewer
Last synced: 30 Jul 2025
https://github.com/cajuncoding/apachefop.serverless
A ready-to-use serverless implementation of Apache FOP via Azure Functions. This provides a microservice for dynamically rendering quality PDF binary output from XSL-FO source using Apache FOP. Combined with the ease and simplicity of Azure Functions, this project is a powerful, efficient, and scalable PDF reporting service that generates high-quality, true paged-media reports for any environment and any client technology (.NET, Node.js/JavaScript, Ruby, mobile iOS/Android, PowerShell, even Windows/Mac apps, etc.).
apache-fop azure-functions fonet fop micro-service microservice pdf pdf-as-a-service pdf-files pdf-generation pdf-rendering pdf-templating reports reportserver reportservice serverless serverless-architectures xsl-fo xslfo xslt
Last synced: 21 Mar 2025
https://github.com/cajuncoding/pdftemplating.xslfo
This is a C# .NET solution that tests and illustrates the capabilities of XSL-FO for dynamically generating PDF documents. It provides a way to generate the XSL-FO source XML from XSLT or Razor templating, to suit different development teams. It also provides a basic WinForms application that can be used to aid in developing the XSLT and XSL-FO markup.
apache-fop azure-functions csharp fonet fop java microservice pdf-files pdf-generation pdf-rendering pdf-reports pdf-templating razor-templates razor-views serverless template-engine xsl-fo xslfo xslt xslt-template
Last synced: 16 Mar 2025
https://github.com/wbthomason/pdf-scribe.nvim
Neovim plugin for importing annotations and metadata from PDFs
annotations luajit neovim neovim-plugin notes nvim pdf pdf-files pdf-scribe poppler
Last synced: 29 Oct 2025
https://github.com/bronson/pdfdir
Utilities to operate on lots of PDF files
Last synced: 22 Aug 2025
https://github.com/scriptim/pdf-meta-editor
Interactive CLI for changing metadata of PDF files
cli command-line command-line-interface command-line-tool document exif exif-metadata exiftool interactive interactive-cli meta-data metadata nodejs npm npm-package pdf pdf-document pdf-files pdf-meta-editor tool
Last synced: 24 Jun 2025
https://github.com/george-gca/ai_papers_scrapper
Download paper PDFs and other info from the main AI conferences
artificial-intelligence conferences pdf-files python scraping scrapy
Last synced: 21 Mar 2025
https://github.com/edoardottt/multi-pdf-finder
Are you looking for a word in many pdf files? Do it one time. ⚡
bash find finder multi-pdf-finder pdf pdf-document pdf-files pdf-scraping python3 script search-algorithm search-in-text searching-algorithms text word wordsearch
Last synced: 21 Aug 2025
https://github.com/kairavkkp/merge-pdf
My first PyPI package. Merge image and PDF files within a folder from the command line, with customization options.
epubs hacktoberfest hacktoberfest2021 merge-images merge-pdf merge-pdf-images pdf-files pdfs
Last synced: 19 Mar 2025
https://github.com/piomin/sample-jasperreport-boot
Sample application that shows how to generate large PDF files using Spring Boot with JasperReports
jasperreports java pdf pdf-files pdf-generation performance spring-boot
Last synced: 10 Aug 2025
https://github.com/josee9988/compress-pdfs
A Python CLI script to compress 📦 all the PDF files recursively in a directory using the iLovePDF technology 🥰
compress compress-files compress-pdf compressed compression compressor compressors pdf pdf-compression pdf-converter pdf-document-processor pdf-files python python-3 python-compressing python-pdf python3 python3-script python3-scripts size-reduction
Last synced: 30 Oct 2025
https://github.com/bbc-esq/pyside6_pdf_viewer
Simple PDF Viewer Using Pyside6 that can be run standalone or included in a larger program.
pdf pdf-document pdf-files pdf-viewer pdfjs pyside6
Last synced: 01 May 2025
https://github.com/r3dhulk/pdf-exif
Get metadata from PDF files
blackhat-python blackhatpython cyber-security cybersecurity ethical-hacking exif-data exif-metadata exiftool getinfo information-gathering metadata pdf pdf-exif pdf-files pdf-metadata python pythonforethicalhacking security
Last synced: 25 Aug 2025
https://github.com/zanysoft/laravel-pdf
PDF document generator from HTML
html-to-pdf html-to-pdf-php laravel laravel-pdf mpdf pdf pdf-converter pdf-document pdf-files pdf-generation php
Last synced: 14 Sep 2025
https://github.com/lstedmanfalls/translatedocument
TypeScript / Node / Express web app to translate documents and export translation
pdf-files translation typescript
Last synced: 04 May 2025
https://github.com/sameerkumar18/pdfgeneratorapi-python
PDFGeneratorAPI Python Wrapper
package pdf-document pdf-files pdf-generation pdf-merger pip python-library python3 wrapper-library
Last synced: 10 Jul 2025
https://github.com/orchetect/pdfgadget
Batch PDF operations for Swift
pdf pdf-document-processor pdf-files pdf-merger swift
Last synced: 23 Apr 2025
https://github.com/yifaneye/1pdf
💫 CLI tool for combining all PDF files in a directory into 1 PDF file 👉 sudo npm i -g 1pdf
combine merge pdf pdf-document pdf-files
Last synced: 04 Aug 2025
https://github.com/kouisamine/compare-two-pdf-files
Comparing PDFs can be useful in various scenarios, such as document version control, content verification, and collaboration.
comparator compare compare-pdf comparison js online pdf pdf-comparison pdf-diff pdf-difference pdf-document pdf-files php script source-code tools
Last synced: 23 Oct 2025
https://github.com/parzibyte/extraer-texto-imagenes-pdf-php
Examples of using PdfParser to extract text and images from a PDF document with PHP
Last synced: 12 Apr 2025
https://github.com/shosta/merge-pdfs
Merge PDFs in a folder automatically, based on the file names.
pdf-files python scanner utility
Last synced: 07 Apr 2025
https://github.com/zeeshanahmad4/nlp-pdf-minning-extracting-text-from-pdf
NLP PDF mining: extracting text from PDFs
extract-text pdf pdf-converter pdf-document-processor pdf-files pdf-format pdf-text-extraction pdfcon pdfkit pdftohtml pdftoimage pdftools pdftotext python text-extraction
Last synced: 01 Apr 2025
https://github.com/thomasvanholder/browserless
A Ruby wrapper for the Browserless PDF API with support for modern CSS such as TailwindCSS
pdf pdf-converter pdf-document pdf-document-processor pdf-files pdf-generation
Last synced: 25 Apr 2025
https://github.com/txuswashere/snes-manuals
Collection of All US SNES Manuals and All PAL Exclusive manuals.
documentation documents game-manual gamemanuals manual manuals pdf pdf-files pdfs snes snes-manuals snesmanuals super-nintendo super-nintendo-entertainment-system supernintendo
Last synced: 05 Jan 2026
https://github.com/dinoosauro/pdf-pointer
Display PDFs with a pointer, make quick annotations (that automatically disappear), and zoom them
pdf pdf-annotation pdf-document pdf-files pdf-js pdf-pointer pdf-viewer pdfjs pdfjs-viewer
Last synced: 08 Sep 2025
https://github.com/moindalvs/how_to_convert_pdf_to_docx_in_python
How to convert .pdf files into .docx files in Python?
docx pdf pdf-converter pdf-files pdf2docx
Last synced: 02 Jul 2025
https://github.com/txuswashere/gamaker
www.gamaker.org
cyber-security cyber-security-team cyber-threat-intelligence hacking maker pdf pdf-document pdf-files security-audit security-testing security-vulnerability seguridad seguridad-informatica
Last synced: 05 Jan 2026
https://github.com/drmccoy/pdftextorizer
Interactively extract text from multi-column PDFs
gui pdf pdf-extractor pdf-files pdf2text pdftotext pyqt5 qt5
Last synced: 07 Jan 2026
https://github.com/jabalazs/pdfly
Basic CLI tool for extracting and merging PDFs
Last synced: 22 Feb 2025
https://github.com/tfawcett/rename_papers
rename_papers is a simple program to extract the title from a PDF file, manipulate it, and rename the file.
Last synced: 07 Apr 2025
https://github.com/xxdashtixx/pdfmultitool
PDF Multi-Tool is a .NET Framework Windows Forms application written in C# that provides PDF conversions using the Ghostscript library, supporting conversions between various file formats and offering a user-friendly interface for managing and configuring conversion tasks.
c-sharp csharp dotnet ghostscript graphical-user-interface gui open-source pdf pdf-converter pdf-document pdf-files pdf-generation pdf-viewer user-interface windows-forms
Last synced: 15 Jun 2025
https://github.com/katahiromz/python-cheatsheet-japanese
Python cheat sheet (Japanese)
cheat-sheets cheatsheets japanese pdf-files python svg-files
Last synced: 02 Mar 2025
https://github.com/geo-y20/enhanced-learning-experience
IntelliLearn is a FastAPI-based application designed to process and transcribe audio and video files into text using the Whisper model. The application also supports processing PDF files to extract and summarize their content.
chat-application chatgpt educational-project fastapi groq-api huggingface lama llm pdf-files platform python speech-to-text text-summarization transformer whisper word2vec wordembedding
Last synced: 06 Apr 2025
https://github.com/r0wi-dev/workflow_ocr_backend
Alternative backend for https://github.com/R0Wi-DEV/workflow_ocr
nextcloud nextcloud-workflow-ocr ocr pdf-files
Last synced: 25 Feb 2025
https://github.com/birdo1221/htb-writeup
Write-Ups, Tools and Scripts for Hack The Box
challenges hack-the-box hackthebox hackthebox-writeups machines pdf-files script scripts sherlocks write-ups writer writing
Last synced: 05 Mar 2025
https://github.com/najmiter/pdf
Free PDF editor (no limits – no server involved)
merge pdf pdf-editor pdf-files pdf-merge ts typescript web
Last synced: 03 Sep 2025
https://github.com/jpleorx/pdf-scan
Experimenting with iTextPdf, PdfBox, Tesseract
itext itextpdf java ocr pdf pdf-files pdf-generation pdf-reader pdfbox pdfbox2 tess4j tesseract tesseract-ocr
Last synced: 25 Oct 2025
https://github.com/txuswashere/fundaciontelefonica.com
https://conectaempleo-formacion.fundaciontelefonica.com/
cyber-security pdf-document pdf-files security security-audit security-tools security-vulnerability seguridad seguridad-informatica telefonica
Last synced: 05 Jan 2026
https://github.com/kavex/pdf-combine
Allows you to combine multiple PDFs into one PDF. Add, rearrange, or delete pages.
add combine combiner edit export pdf pdf-converter pdf-document pdf-files pdfs python python-3 python-app python-script python3 remove
Last synced: 07 Mar 2025
https://github.com/hugoamoraes/contadorpdf
A small system that counts the number of pages in a PDF
Last synced: 01 Mar 2025
https://github.com/justnunuz/corrupt-pdf
This repository contains materials from my talks at PyCon Zimbabwe 2024 and PyCon Namibia 2025, where I delve into the often-overlooked security risks of PDF files in cybersecurity and forensics.
forensics pdf pdf-files vulnerabilities
Last synced: 17 Mar 2025
https://github.com/ivan-ayub97/encorpdf
EncorPDF Viewer is an efficient viewing application designed for viewing PDF documents.
pdf pdf-files pdf-viewer pymupdf pyqt5 pyqt5-desktop-application python python3 viewer
Last synced: 23 Jul 2025
https://github.com/ejdecena/my-technical-docs
My Technical Docs is a repository where additional material for technical computer learning is collected.
docs documentation pdf-document pdf-files
Last synced: 30 Mar 2025
https://github.com/secret-guest/font_lister
List the fonts that are installed on your computer as a PDF list with corresponding images.
font fonts front-end-development frontend list listes lists-python listsort otf otf-fonts pdf pdf-files pdf-generation police python ttf ttf-fonts
Last synced: 25 Feb 2025
https://github.com/secret-guest/pdf_printer
Make printable A4 PDF from images
a4 images javascript js pdf pdf-document pdf-files pdf-generation pdfjs print python secret-guest web-application
Last synced: 25 Feb 2025
https://github.com/dinoosauro/merge-pdf
Merge PDFs from the command line.
pdf pdf-document pdf-files pdf-generation pdf-merge pdf-merger pdf-merging
Last synced: 05 Jul 2025
https://github.com/ivan-ayub97/encorpdf_es
EncorPDF Viewer is a sleek and efficient application designed for viewing PDF documents. Tailored for those who simply want to open and navigate PDF files without unnecessary features, distractions, or intrusive ads, it offers a straightforward and hassle-free user experience.
pdf pdf-files pdf-viewer pymupdf pyqt5 pyqt5-desktop-application python python3 viewer
Last synced: 14 Jul 2025
https://github.com/wesleya0101/pdf-split-pro
PDF-Split-PRO is a tool for splitting payslip pages in PDFs, automatically renaming them based on keywords. Created to speed up the process of separating, naming, and sending documents in HR, the tool improves productivity and takes care of repetitive tasks.
empresa pdf pdf-document pdf-files python recursos-humanos
Last synced: 05 Apr 2025
https://github.com/rockerzxy/pdfiler
CLI utility for creating PDFs from images
automation pdf pdf-files pdfiler python
Last synced: 28 Mar 2025
https://github.com/tonygrif/pdf-finder
A Python program to locate links to PDFs found within a webpage from the command line
docker pdf-files python web-scraping
Last synced: 28 Feb 2025
https://github.com/madmax2506/pdf-tools
Utilities for some common operations on PDF files.
pdf pdf-files pdf-generation pdf-merge pdf-rotate pdf-split
Last synced: 28 Apr 2025
https://github.com/grv96/pypdf2_structures
This library writes text representations of the object structures that make the pages and fields of PDF files in PyPDF2.
pdf pdf-document pdf-files pypdf pypdf2 python
Last synced: 28 Oct 2025
https://github.com/feliwir/charta-pdf
A modern C++ library for processing PDF documents.
cpp17 pdf pdf-files pdf-generation
Last synced: 13 Dec 2025
https://github.com/403errors/ai-docparser
An application framework developed using the latest AI technologies to extract the values of specific pre-defined keys from a given PDF document. It also generates a document summary using the keys and values extracted in the process.
automation csv-export nlp pdf-files python3 regex reinforcement-learning spacy
Last synced: 07 Oct 2025
https://github.com/omaxel/pdf-overlap
Overlaps two PDF files.
csharp itextsharp pdf pdf-files pdf-overlap
Last synced: 28 Dec 2025