Projects in Awesome Lists by pd3f
A curated list of projects in awesome lists by pd3f.
https://github.com/pd3f/pd3f
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
extract-text language-model machine-learning ocr parsr pd3f pdf pdf-to-text pipeline python text-extraction
Last synced: 08 Apr 2026
https://github.com/pd3f/dehyphen
📜 Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF
dehyphenation flair flair-embeddings german hyphen hyphens nlp pd3f pdf python
Last synced: 08 Apr 2026
https://github.com/pd3f/pd3f-core
📑 Python package to reconstruct the original continuous text from PDFs with language models
dehyphenation language-model machine-learning pd3f pdf text-extraction
Last synced: 08 Apr 2026