Projects in Awesome Lists tagged with pdfbox
A curated list of projects in awesome lists tagged with pdfbox.
https://github.com/danfickle/openhtmltopdf
An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDFBox 2. With SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!
accessibility css html java pdf pdf-generation pdfbox svg
Last synced: 13 May 2025
https://github.com/uglytoad/pdfpig
Read and extract text and other content from PDFs in C# (port of PDFBox)
alto-xml csharp document-analysis hocr layout-analysis netstandard page-xml pdf pdf-document pdf-document-processor pdf-extractor pdf-files pdf-generation pdfbox
Last synced: 10 May 2025
https://github.com/jonathanlink/pdflayouttextstripper
Converts a PDF file into a text file while keeping the layout of the original PDF. Useful, for instance, for extracting the content of a table in a PDF file. This is a subclass of the PDFTextStripper class (from the Apache PDFBox library).
data-extraction extract java layout pdf pdfbox text
Last synced: 15 May 2025
https://github.com/hwding/pdf-unstamper
Remove textual watermarks of any font, any encoding and any language with pdf-unstamper now!
command-line-tool pdf pdf-merge pdfbox stamp tool
Last synced: 02 Apr 2025
https://github.com/red6/pdfcompare
A simple Java library to compare two PDF files
Last synced: 14 Jan 2026
https://github.com/lebedov/python-pdfbox
Python interface to Apache PDFBox command-line tools.
Last synced: 09 Apr 2025
https://github.com/rostrovsky/pdf-table
Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV
java-library java8 opencv opencv3 pdf-parsing pdfbox table tables
Last synced: 13 Aug 2025
https://github.com/phax/ph-pdf-layout
Java library for creating fluid page layouts with Apache PDFBox. Supporting multi-page tables, different page layouts etc.
java layout-engine pdf pdf-generation pdfbox rendering
Last synced: 06 Apr 2025
https://github.com/bobld/pdfpig.rendering.skia
Cross-platform C# library to render PDF as images
csharp dotnet image netframework netstandard pdf pdf-render pdf-renderer pdf-rendering pdfbox pdfpig render skia skiasharp
Last synced: 21 Feb 2026
https://github.com/hrbrmstr/pdfbox
📄◻️ Create, Manipulate and Extract Data from PDF Files (R Apache PDFBox wrapper)
pdf-document pdf-files pdfbox pdfbox-wrapper r r-cyber rstats
Last synced: 21 Mar 2025
https://github.com/estevaocm/assinadorpdf
PDF document signer for ICP-Brasil certificates based on Demoiselle Signer, BouncyCastle and PDFBox.
bouncy-castle demoiselle digital-signature icp-brasil pdfbox
Last synced: 12 May 2025
https://github.com/aleksandr-m/struts2-pdfstream
A Struts2 plugin for creating PDFs from HTML, JSPs, FreeMarker templates and Apache Tiles definitions.
apache-tiles freemarker jsp openhtmltopdf pdf pdfbox struts2 struts2-plugin
Last synced: 13 Apr 2025
https://github.com/cityssm/pdfflattener
PDF Flattener - Secure PDF documents by making floating redactions and form entries permanent.
flatten java pdf pdf-flattener pdf-forms pdfbox redaction security
Last synced: 26 Jun 2025
https://github.com/shaido987/invivogen-printer-tool
For automatic download of specified TDS documents
automatisation pdf pdfbox printer printing web-scraping
Last synced: 09 Apr 2025
https://github.com/padam87/pdfbox-preflight
:rocket: PDF/X-1a and PDF/X-3 preflight (validation) with pdfbox
Last synced: 14 Jul 2025
https://github.com/leftshiftone/pdfscript
PDFScript is an open source software library for script based PDF generation.
Last synced: 14 Jan 2026
https://github.com/konik-io/pdfbox-carriage
Add-on for the Konik library that allows attaching XML content to PDFs, and extracting it from them, with the help of PDFBox
Last synced: 14 Jan 2026
https://github.com/bompi88/pdfmerger
Node module that uses the PDFBox library to merge PDF files into a single PDF file.
combine merge node nodejs npm-package pdf pdfbox pdfmerger stream
Last synced: 11 Apr 2025
https://github.com/upgundecha/se-powertools-recipes
Example project for ATA Seleniumsummit21 CP-SAT
faker pdfbox selenium selenium-webdriver shutterbug tesseract tesseract-ocr webdrivermanager wiremock
Last synced: 25 Aug 2025
https://github.com/esign-consulting/postdenuncia
Software project for citizens to report urban problems.
geocoding html-parser java jsoup maven pdfbox webscraping
Last synced: 26 Jan 2026
https://github.com/quarkiverse/quarkus-pdfbox
An open source Java tool for working with PDF documents
apache pdf pdfbox quarkus-extension
Last synced: 11 Oct 2025
https://github.com/jhund/pdfbox_text_extraction
Provides a JRuby wrapper for the Apache PDFBox library to extract plain text from PDF documents.
extract jruby pdfbox plain-text
Last synced: 03 Oct 2025
https://github.com/lucasvbr/lecteurpdfdoubleaffichage
Java application for viewing one or two PDF documents at the same time. It will also be possible to open the same file twice.
java javaswing pdf-viewer pdfbox
Last synced: 17 Mar 2025
https://github.com/chqu1012/pdfeditorfx
A small javafx pdf editor
bellsoft-liberica eclipse emf javafx pdf pdfbox xcore xml
Last synced: 22 Apr 2026
https://github.com/apache/pdfbox-testfiles
Mirror of Apache PDFBox Testfiles
Last synced: 14 Apr 2026
https://github.com/seryiza/remplater
Code-generated PDF templates for the reMarkable 2
clojure pdf-template pdfbox remarkable
Last synced: 27 Jul 2025
https://github.com/localhostpib/youtubeapi
YouTube comments and metadata in a SQLite database using the YouTube/Google API.
apache-commons-csv csv google-oauth2 hibernate i18n jar java javafx jdbc-sqlite lombok materialfx maven openjfx pdfbox pixabay sqlite xhtml youtube-api
Last synced: 23 Feb 2026
https://github.com/jpleorx/pdf-scan
Experimenting with iTextPdf, PdfBox, Tesseract
itext itextpdf java ocr pdf pdf-files pdf-generation pdf-reader pdfbox pdfbox2 tess4j tesseract tesseract-ocr
Last synced: 25 Oct 2025
https://github.com/abhishek010397/mlops-aws-lambda
java lambda-functions pdfbox pdftoimage
Last synced: 09 Apr 2025
https://github.com/geo-mena/signsafe
🚦 Service that validates digital signatures in PDF documents.
bouncy-castle-library java open-source pdf pdfbox
Last synced: 29 Aug 2025
https://github.com/neophron88/pdf-book-converter
This desktop application allows the user to convert a PDF file to an SQLite database
jdbc jdbc-driver-sqlite maven pdfbox swing-gui
Last synced: 25 Feb 2025
https://github.com/ekelhala/jmerge
Minimal tool for merging PDF files together, written in Java
file-management gui java maven pdf-document pdfbox swing
Last synced: 13 May 2026
https://github.com/fazliddinr/pdf-book-converter
This desktop application allows the user to convert a PDF file to an SQLite database
jdbc jdbc-driver-sqlite maven pdfbox swing-gui
Last synced: 27 Apr 2026
https://github.com/amodsachintha/library-management-java-javafx
Library Management Software for Dickwella Pradeshiya Sabhawa
apache-derby java javafx-desktop-apps javafx-gui library-management-system pdfbox
Last synced: 18 May 2026
https://github.com/kolade-amire/text-extractor
Kotlin regex scripts used in text preprocessing for the qlish project
Last synced: 29 Oct 2025
https://github.com/wishall/pdf-extraction-api
A lightweight, production-ready PDF extraction API built with Spring Boot. Supports full-text extraction, per-page parsing, and metadata retrieval using Apache PDFBox. Designed for rapid deployment on cloud platforms.
api extraction pdf pdfbox spring-boot text-extraction
Last synced: 23 Nov 2025
https://github.com/mittalsoni00/filereader
Java PdfReader API is a Spring Boot-based application that extracts text from PDF files using the Apache PDFBox library. It provides a REST API to upload PDFs and retrieve their extracted text. This project simplifies text extraction for various applications like document processing and data analysis.
github-config java maven pdfbox postman spring spring-boot spring-web
Last synced: 06 Apr 2026
https://github.com/marioszocs/pdf-splitter
Split PDF files by size, by page, and extract email addresses
itextpdf java pdf pdfbox pdfextraction pdfsplitter
Last synced: 04 Jul 2025
https://github.com/marinanimanga/telocedo
TLoCdo is an app designed to provide an outlet for texts written by people who don't have the opportunity to publish them. Through this app, these writers can upload their texts for free and read others'.
Last synced: 13 May 2026
https://github.com/peasfultown/jebman
jebman - java ebooks manager
ebook-manager java java11 jdbc pdfbox sqlite3
Last synced: 29 Mar 2025
https://github.com/stefan-goldschmidt/pdf_merger_desktop
Desktop application for merging multiple PDF files into a single document
javafx-application javafx-desktop-apps macos pdf-generation pdfbox windows
Last synced: 16 Mar 2026
https://github.com/qble2/pdf-viewer-spring-fx-app
A JavaFX application that is launched and managed via Spring Boot. This application aims to ease the manipulation of large PDF files (browse, split, filter, search and annotate).
css guava-event-bus h2-database hibernate java javafx jpa lombok maven openjdk pdfbox scene-builder spring-boot spring-data-jpa
Last synced: 31 Jan 2026
https://github.com/nilostolte/pdfbox
This project offers several versions of PDFBox source code that can be compiled with Eclipse. The complete version is a complete unmodified PDFBox with all packages normally not included in PDFBox source code. The other versions are modified versions offering more capabilities.
eclipse java pdf-converter pdf-debugger pdf-documents pdf-viewer pdfbox pdfbox-source
Last synced: 10 Feb 2026
https://github.com/mujahidk/pdfboxer
PDF generation service built as a Spring Boot application using Apache PDFBox.
Last synced: 17 Apr 2026
https://github.com/piyushwani004/imagetotextconvertor
Convert any image file or PDF file into a text document using OCR (Optical Character Recognition) with this program.
commons-io java-8 javafx ocr-recognition pdfbox
Last synced: 18 Apr 2026
https://github.com/shardulvs/pdfinspector-android
A DevTools-style PDF element inspector and editor for Android
android document-editor jetpack-compose kotlin material3 open-source pdf pdf-editor pdf-viewer pdfbox
Last synced: 04 Jun 2026
https://github.com/drpedapati/csharp-pdf-filler
Syncfusion-free C#/.NET CLI for inspecting, schema-exporting, filling, and flattening AcroForm PDFs.
acroform cli csharp dotnet form-filling pdf pdf-forms pdfbox
Last synced: 26 Apr 2026
https://github.com/jordy-swinnen/payslip-poc
A proof-of-concept Spring Boot application that uses Spring AI with OpenAI to extract structured data from payslip documents (PDF or image files).
java openai openai-api pdfbox spring spring-ai spring-ai-openai spring-boot
Last synced: 30 Apr 2026
https://github.com/sidmishraw/autobot
PDF parsing and extraction utility using Apache Tika
apache-tika data-extraction java pdf-parsing pdfbox
Last synced: 10 Jun 2026