An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by pd3f

A curated list of projects in awesome lists by pd3f.

https://github.com/pd3f/pd3f

🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based

extract-text language-model machine-learning ocr parsr pd3f pdf pdf-to-text pipeline python text-extraction

Last synced: 08 Apr 2026

https://github.com/pd3f/dehyphen

📜 Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF

dehyphenation flair flair-embeddings german hyphen hyphens nlp pd3f pdf python

Last synced: 08 Apr 2026

https://github.com/pd3f/pd3f-core

📑 Python package to reconstruct the original continuous text from PDFs with language models

dehyphenation language-model machine-learning pd3f pdf text-extraction

Last synced: 08 Apr 2026