Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with pdf
A curated list of projects in awesome lists tagged with pdf.
https://github.com/justjavac/free-programming-books-zh_cn
:books: Free Chinese-language computer programming books; submissions welcome
android angular books free ios javascript kotlin pdf programming python react react-native swift vue
Last synced: 16 Dec 2024
https://github.com/justjavac/free-programming-books-zh_CN
:books: Free Chinese-language computer programming books; submissions welcome
android angular books free ios javascript kotlin pdf programming python react react-native swift vue
Last synced: 26 Oct 2024
https://github.com/stirling-tools/stirling-pdf
#1 Locally hosted web application that allows you to perform various operations on PDF files
docker java pdf pdf-converter pdf-editor pdf-manipulation pdf-merger pdf-ocr pdf-tools pdf-web-apps pdfmerger
Last synced: 16 Dec 2024
https://github.com/paperless-ngx/paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
angular archiving django dms document-management document-management-system machine-learning ocr optical-character-recognition pdf
Last synced: 16 Dec 2024
https://github.com/siyuan-note/siyuan
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
anki chatgpt electron evernote knowledge-base local-first markdown note-taking notebook notes-app notion obsidian ocr openai pdf pkm s3 self-hosted webdav
Last synced: 16 Dec 2024
https://github.com/forthespada/cs-books
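🔥🔥 More than 1,000 classic computer science books, personal notes, and resources referenced in the author's articles published on various platforms. The books cover C/C++, Java, Python, Go, data structures and algorithms, operating systems, backend architecture, computer systems, databases, computer networking, design patterns, front-end development, assembly, and interview write-ups from campus and experienced-hire recruiting.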
algorithms c cpp cs-books database interview java javascript linux os pdf python redis sql
Last synced: 16 Dec 2024
https://github.com/Stirling-Tools/Stirling-PDF
Locally hosted web application that allows you to perform various operations on PDF files
docker java pdf pdf-converter pdf-manipulation pdf-merger pdf-ocr pdf-tools pdf-web-apps pdfmerger
Last synced: 28 Oct 2024
https://github.com/forthespada/CS-Books
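🔥🔥 More than 1,000 classic computer science books, personal notes, and resources referenced in the author's articles published on various platforms. The books cover C/C++, Java, Python, Go, data structures and algorithms, operating systems, backend architecture, computer systems, databases, computer networking, design patterns, front-end development, assembly, and interview write-ups from campus and experienced-hire recruiting.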
algorithms c cpp cs-books database interview java javascript linux os pdf python redis sql
Last synced: 05 Nov 2024
https://github.com/opendatalab/mineru
A high-quality, one-stop, open-source data extraction tool that converts PDF to Markdown and JSON.
ai4science document-analysis extract-data layout-analysis ocr parser pdf pdf-converter pdf-extractor-llm pdf-extractor-pretrain pdf-extractor-rag pdf-parser python
Last synced: 16 Dec 2024
https://github.com/koreader/koreader
An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices
cbz djvu djvu-reflow ebook ebook-reader eink epub ereader fb2 kindle kobo luajit opds pdf pdf-reflow pocketbook reader reflow remarkable-tablet ubuntu-touch
Last synced: 16 Dec 2024
https://github.com/salomonelli/best-resume-ever
:necktie: :briefcase: Build fast :rocket: and easy multiple beautiful resumes and create your best CV ever! Made with Vue and LESS.
cv javascript nodejs pdf resume vue
Last synced: 16 Dec 2024
https://github.com/ether/etherpad-lite
Etherpad: A modern really-real-time collaborative document editor.
collaboration collaborative collaborative-editing collaborative-framework collaborative-research collaborative-writing document documents docx etherpad libreoffice microsoft pdf pdf-generation rich-text-editor video-conference video-conferencing web-editor word
Last synced: 16 Dec 2024
https://github.com/mayooear/gpt4-pdf-chatbot-langchain
GPT4 & LangChain Chatbot for large PDF docs
gpt4 langchain nextjs openai pdf typescript
Last synced: 16 Dec 2024
https://github.com/ds4sd/docling
Get your documents ready for gen AI
ai convert document-parser document-parsing documents docx html markdown pdf pdf-converter pdf-to-json pdf-to-text pptx tables xlsx
Last synced: 21 Dec 2024
https://github.com/jbarlow83/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
image-processing ocr pdf python tesseract
Last synced: 16 Nov 2024
https://github.com/ocrmypdf/ocrmypdf
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
image-processing ocr pdf python tesseract
Last synced: 16 Dec 2024
https://github.com/microsoft/markitdown
Python tool for converting files and office documents to Markdown.
autogen autogen-extension langchain markdown microsoft-office openai pdf
Last synced: 18 Dec 2024
https://github.com/sumatrapdfreader/sumatrapdf
SumatraPDF reader
c c-plus-plus pdf pdf-viewer win32
Last synced: 16 Dec 2024
https://github.com/janishar/mit-deep-learning-book-pdf
MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville
book chapter clear deep-learning deeplearning excercises good learning lecture-notes linear-algebra machine machine-learning mit neural-network neural-networks pdf print printable thinking
Last synced: 17 Dec 2024
https://github.com/ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
image-processing ocr pdf python tesseract
Last synced: 26 Oct 2024
https://github.com/questpdf/questpdf
QuestPDF is a modern open-source .NET library for PDF document generation. It offers a comprehensive layout engine powered by a concise and discoverable C# Fluent API. Easily generate PDF reports, invoices, exports, etc.
create creation csharp dotnet export generate html invoice nuget pdf report reporting tool
Last synced: 16 Dec 2024
https://github.com/QuestPDF/QuestPDF
QuestPDF is a modern open-source .NET library for PDF document generation. It offers a comprehensive layout engine powered by a concise and discoverable C# Fluent API. Easily generate PDF reports, invoices, exports, etc.
create creation csharp dotnet export generate html invoice nuget pdf report reporting tool
Last synced: 29 Oct 2024
https://github.com/h2oai/h2ogpt
Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
ai chatgpt embeddings generative gpt gpt4all llama2 llm mixtral pdf private privategpt vectorstore
Last synced: 16 Dec 2024
https://github.com/xournalpp/xournalpp
Xournal++ is a handwriting notetaking software with PDF annotation support. Written in C++ with GTK3, supporting Linux (e.g. Ubuntu, Debian, Arch, SUSE), macOS and Windows 10. Supports pen input from devices such as Wacom Tablets.
c-plus-plus crossplatform gtk3 notes notetaking pdf pdf-viewer pen
Last synced: 16 Dec 2024
https://github.com/hmemcpy/milewski-ctfp-pdf
Bartosz Milewski's 'Category Theory for Programmers' unofficial PDF and LaTeX source
category-theory cpp functional-programming haskell latex ocaml pdf scala
Last synced: 16 Dec 2024
https://github.com/kekingcn/kkfileview
Universal File Online Preview Project based on Spring-Boot
docx fileview fileviewer java kkfileview office office-view pdf word
Last synced: 16 Dec 2024
https://github.com/kekingcn/kkFileView
Universal File Online Preview Project based on Spring-Boot
docx fileview fileviewer java kkfileview office office-view pdf word
Last synced: 29 Oct 2024
https://github.com/libvips/libvips
A fast image processing library with low memory needs.
c cpp gif graphicsmagick heic image-processing imagemagick jpeg libvips nifti openexr openslide pdf pdfium png svg tiff webp
Last synced: 16 Dec 2024
https://github.com/unstructured-io/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
data-pipelines deep-learning document-image-analysis document-image-processing document-parser document-parsing docx donut information-retrieval langchain llm machine-learning ml natural-language-processing nlp ocr pdf pdf-to-json pdf-to-text preprocessing
Last synced: 16 Dec 2024
https://github.com/wmjordan/pdfpatcher
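PDFPatcher (PDF补丁丁): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more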
pdf pdf-converter pdf-document-processor pdf-generation
Last synced: 17 Dec 2024
https://github.com/wmjordan/PDFPatcher
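PDFPatcher (PDF补丁丁): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more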
pdf pdf-converter pdf-document-processor pdf-generation
Last synced: 29 Oct 2024
https://github.com/Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
data-pipelines deep-learning document-image-analysis document-image-processing document-parser document-parsing docx donut information-retrieval langchain llm machine-learning ml natural-language-processing nlp ocr pdf pdf-to-json pdf-to-text preprocessing
Last synced: 30 Oct 2024
https://github.com/wojtekmaj/react-pdf
Display PDFs in your React app as easily as if they were images.
Last synced: 16 Dec 2024
https://github.com/0voice/expert_readed_books
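Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, ideas, mathematics, and biographies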
books nginx pdf redis software-architecture software-design
Last synced: 17 Dec 2024
https://github.com/flxzt/rnote
Sketch and take handwritten notes.
drawing gtk gtk-rs gtk4 gtk4-rs hacktoberfest handwriting infinite-canvas notes notes-app pdf rust wacom-tablet
Last synced: 18 Dec 2024
https://github.com/documenso/documenso
The Open Source DocuSign Alternative.
digital-signature document-signing docusign-alternative e-signature esign esignature next-auth nextjs open-source pades-standard pdf pdf-sign pdf-signature postgresql prisma self-hosted signing typescript
Last synced: 16 Dec 2024
https://github.com/py-pdf/pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
help-wanted pdf pdf-documents pdf-manipulation pdf-parser pdf-parsing pypdf2 python
Last synced: 16 Dec 2024
https://github.com/gotenberg/gotenberg
A developer-friendly API for converting numerous document formats into PDF files, and more!
api chrome chromium convert-to-pdf docker docx-to-pdf excel exiftool html-to-pdf libreoffice openoffice pdf pdf-converter pdftk puppeteer qpdf screenshots unoconv wkhtmltopdf word
Last synced: 16 Dec 2024
https://github.com/windingwind/zotero-pdf-translate
Translate PDF, EPub, webpage, metadata, annotations, notes to the target language. Support 20+ translate services.
pdf plugin translate translation zotero zotero-plugin zotero7
Last synced: 17 Dec 2024
https://github.com/mstamy2/PyPDF2
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
help-wanted pdf pdf-documents pdf-manipulation pdf-parser pdf-parsing pypdf2 python
Last synced: 29 Oct 2024
https://github.com/kozea/weasyprint
The awesome document factory
converter css html pdf python weasyprint
Last synced: 16 Dec 2024
https://github.com/Kozea/WeasyPrint
The awesome document factory
converter css html pdf python weasyprint
Last synced: 30 Oct 2024
https://github.com/ahrm/sioyek
Sioyek is a PDF viewer with a focus on textbooks and research papers
Last synced: 17 Dec 2024
https://github.com/alvarcarto/url-to-pdf-api
Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.
chrome headless headless-chrome heroku heroku-button html invoice pdf puppeteer receipt
Last synced: 18 Dec 2024
https://github.com/hopding/pdf-lib
Create and modify PDF documents in any JavaScript environment
copy copying create creation document documents edit editing generation javascript lib library modification modify pdf typescript umd
Last synced: 16 Dec 2024
https://github.com/Hopding/pdf-lib
Create and modify PDF documents in any JavaScript environment
copy copying create creation document documents edit editing generation javascript lib library modification modify pdf typescript umd
Last synced: 02 Nov 2024
https://github.com/jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
pdf pdf-parsing table-extraction
Last synced: 16 Dec 2024
https://github.com/opendatalab/MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
ai4science document-analysis extract-data layout-analysis ocr parser pdf pdf-converter pdf-extractor-llm pdf-extractor-pretrain pdf-extractor-rag pdf-parser python
Last synced: 29 Oct 2024
https://github.com/docusealco/docuseal
Open source DocuSign alternative. Create, fill, and sign digital documents ✍️
daisyui document-signing documents e-signature github-catalyst hotwired-turbo legaltech open-source pdf pdf-sign pdf-signature ruby-on-rails self-hosted tailwindcss vuejs webpack
Last synced: 17 Dec 2024
https://github.com/pdfcpu/pdfcpu
A PDF processor written in Go.
go golang golang-library pdf pdf-files pdf-lib pdf-processor pdflib processor
Last synced: 16 Dec 2024
https://github.com/kareadita/kavita
Kavita is a fast, feature rich, cross platform reading server. Built with the goal of being a full solution for all your reading needs. Setup your own server and share your reading collection with your friends and family.
cbz comicinfo comics comics-reader cross-platform csharp epub epub-reader free linux manga media-server metadata opds opds-feed pdf self-hosted ubooquity windows
Last synced: 17 Dec 2024
https://github.com/wandmalfarbe/pandoc-latex-template
A pandoc LaTeX template to convert markdown files to PDF or LaTeX.
eisvogel koma-script latex latex-template markdown markdown-to-pdf pandoc pandoc-template pandoc-templates pdf pdf-generation tex
Last synced: 17 Dec 2024
https://github.com/Wandmalfarbe/pandoc-latex-template
A pandoc LaTeX template to convert markdown files to PDF or LaTeX.
eisvogel koma-script latex latex-template markdown markdown-to-pdf pandoc pandoc-template pandoc-templates pdf pdf-generation tex
Last synced: 01 Nov 2024
https://github.com/pdfminer/pdfminer.six
Community maintained fork of pdfminer - we fathom PDF
Last synced: 16 Dec 2024
https://github.com/axa-group/parsr
Transforms PDF, Documents and Images into Enriched Structured Data
data document extraction hacktoberfest images nlp ocr parsr pdf python typescript
Last synced: 17 Dec 2024
https://github.com/axa-group/Parsr
Transforms PDF, Documents and Images into Enriched Structured Data
data document extraction hacktoberfest images nlp ocr parsr pdf python typescript
Last synced: 25 Oct 2024
https://github.com/KAYOKG/BibliotecaDev
📚 Library of essential programming books. (Check out my new project `SendScriptWhatsapp`)
biblioteca-do-desenvolvedor english leitura livro livros-desenvolvedor pdf portuguese
Last synced: 07 Nov 2024
https://github.com/kayokg/bibliotecadev
📚 Library of essential programming books. (Check out my new project `SendScriptWhatsapp`)
biblioteca-do-desenvolvedor english leitura livro livros-desenvolvedor pdf portuguese
Last synced: 03 Dec 2024
https://github.com/mfts/papermark
Papermark is the open-source DocSend alternative with built-in analytics and custom domains.
dataroom hacktoberfest next-auth nextjs open-source pdf postgresql prisma tailwindcss typescript zod
Last synced: 17 Dec 2024
https://github.com/pymupdf/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
data-science epub extract-data font mupdf ocr pdf pdf-documents pymupdf python table-extraction tesseract text-processing text-shaping xps
Last synced: 06 Nov 2024
https://github.com/pymupdf/pymupdf
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
data-science epub extract-data font mupdf ocr pdf pdf-documents pymupdf python table-extraction tesseract text-processing text-shaping xps
Last synced: 16 Dec 2024
https://github.com/forthespada/InterviewGuide
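🔥🔥 "InterviewGuide" is A Xiu's record of years of self-taught computer science study on the way from campus to the workplace, together with a collection of articles by junior schoolmates summarizing their campus and autumn recruiting experience. It includes, but is not limited to, study notes on C/C++, Golang, JavaScript, Vue, operating systems, data structures, computer networking, MySQL, and Redis. Keep learning, keep growing!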
code cpp data-structures-and-algorithms guide interview interview-practice interview-preparation interview-questions java mysql os pdf questions-and-answers redis system
Last synced: 04 Nov 2024
https://github.com/forthespada/interviewguide
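🔥🔥 "InterviewGuide" is A Xiu's record of years of self-taught computer science study on the way from campus to the workplace, together with a collection of articles by junior schoolmates summarizing their campus and autumn recruiting experience. It includes, but is not limited to, study notes on C/C++, Golang, JavaScript, Vue, operating systems, data structures, computer networking, MySQL, and Redis. Keep learning, keep growing!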
code cpp data-structures-and-algorithms guide interview interview-practice interview-preparation interview-questions java mysql os pdf questions-and-answers redis system
Last synced: 17 Dec 2024
https://github.com/laochiangx/common.utility
Various helper classes
chm common cookiehelper excelhelpers help helper httphelper javascript jsonhelper mongodbhelper net npoi page pdf regexhelper sessionhelper sqlhelper tool utility xmlhelper
Last synced: 19 Dec 2024
https://github.com/laochiangx/Common.Utility
Various helper classes
chm common cookiehelper excelhelpers help helper httphelper javascript jsonhelper mongodbhelper net npoi page pdf regexhelper sessionhelper sqlhelper tool utility xmlhelper
Last synced: 07 Nov 2024
https://github.com/Kareadita/Kavita
Kavita is a fast, feature rich, cross platform reading server. Built with the goal of being a full solution for all your reading needs. Setup your own server and share your reading collection with your friends and family.
cbz comicinfo comics comics-reader cross-platform csharp epub epub-reader free linux manga media-server metadata opds opds-feed pdf self-hosted ubooquity windows
Last synced: 25 Oct 2024
https://github.com/tangtangcoding/C-C-
Free e-books and study materials for programmers; you are welcome to follow the author's WeChat official account: 编程与实战
algorithms c computer-science cpp golang java linux mysql pdf python stl
Last synced: 05 Nov 2024
https://github.com/tangtangcoding/c-c-
Free e-books and study materials for programmers; you are welcome to follow the author's WeChat official account: 编程与实战
algorithms c computer-science cpp golang java linux mysql pdf python stl
Last synced: 19 Dec 2024
https://github.com/danburzo/percollate
A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.
cli epub html markdown pdf puppeteer readability
Last synced: 17 Dec 2024
https://github.com/501351981/vue-office
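A collection of Vue components for previewing office files such as Word (.docx), Excel (.xlsx, .xls), PDF, and PPTX, offering a one-stop office file preview solution. Supports Vue 2 and 3 as well as non-Vue frameworks such as React. Web-based pdf, excel, word, pptx preview library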
docx docx-preview excel pdf pdf-preview pdf-viewer vue xlsx xlsx-preview
Last synced: 16 Dec 2024
https://github.com/hcfyapp/crx-selection-translate
A one-stop AI translation extension for selected text, screenshots, full web pages, and audio/video.
chrome chrome-extension crx firefox javascript pdf translation
Last synced: 20 Dec 2024
https://github.com/vslavik/diff-pdf
A simple tool for visually comparing two PDF files
Last synced: 17 Dec 2024
https://github.com/hufe921/canvas-editor
rich text editor by canvas/svg
browser canvas canvas-editor control date-picker editor emr latex pdf pdf-generation rich-text svg typescript vite word wysiwyg
Last synced: 17 Dec 2024
https://vslavik.github.io/diff-pdf/
A simple tool for visually comparing two PDF files
Last synced: 12 Nov 2024
https://github.com/pdf2htmlEX/pdf2htmlEX
Convert PDF to HTML without losing text or format.
html pdf pdf-document-processor pdf-viewer
Last synced: 28 Oct 2024
https://github.com/atlanhq/camelot
Camelot: PDF Table Extraction for Humans
Last synced: 29 Oct 2024
https://github.com/librepdf/openpdf
OpenPDF is a free Java library for creating and editing PDF files, with an LGPL and MPL open source license. OpenPDF is based on a fork of iText. We welcome contributions from other developers. Please feel free to submit pull requests and bug reports to this GitHub repository.
hacktoberfest itext java openpdf pdf pdf-generation
Last synced: 16 Dec 2024
https://github.com/marcbachmann/node-html-pdf
This repo isn't maintained anymore as phantomjs was deprecated a long time ago. Please migrate to headless chrome/puppeteer.
html pdf pdf-converter phantomjs
Last synced: 16 Dec 2024
https://github.com/kermitt2/grobid
A machine learning software for extracting information from scholarly documents
bibliographical-references crf deep-learning fulltext hamburger-to-cow machine-learning metadata pdf rnn scientific-articles transformers
Last synced: 17 Dec 2024
https://github.com/torakiki/pdfsam
PDFsam, a desktop application to split, merge, mix, rotate PDF files and extract pages
combine extract java javafx merge merge-pdf merger pdf pdf-combiner pdf-extractor pdf-manipulation pdf-merge pdf-mix pdf-rotate pdf-split rotate split split-pdf splitter
Last synced: 18 Dec 2024
https://github.com/qpdf/qpdf
qpdf: A content-preserving PDF document transformer
Last synced: 17 Dec 2024
https://github.com/jorisschellekens/borb
borb is a library for reading, creating and manipulating PDF files in python.
library pdf pdf-conversion pdf-converter pdf-generation pdf-library python python3 sdk typesetting
Last synced: 17 Dec 2024