https://github.com/filippostanghellini/docfinder
DocFinder is a local-first tool for indexing and searching documents using semantic embeddings stored in SQLite. Everything runs on your machine; no external services are required.
documentsearch local-first local-llm offline pdf privacy rag semantic-search sentence-transformers
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/filippostanghellini/docfinder
- Owner: filippostanghellini
- License: other
- Created: 2025-11-03T22:06:24.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2026-04-25T09:15:46.000Z (2 months ago)
- Last Synced: 2026-04-25T10:24:23.166Z (2 months ago)
- Topics: documentsearch, local-first, local-llm, offline, pdf, privacy, rag, semantic-search, sentence-transformers
- Language: Python
- Homepage:
- Size: 5.57 MB
- Stars: 21
- Watchers: 1
- Forks: 3
- Open Issues: 16
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# DocFinder
Local-first semantic search for your documents.
Supports PDF, Word (.docx), Markdown, and plain text files.
Everything runs on your machine — no cloud, no accounts, complete privacy.
## Features
- **Semantic search** — find documents by meaning, not just keywords (PDF, DOCX, Markdown, TXT)
- **AI chat** — ask questions about any document and get precise answers, powered by local Qwen models (automatically selects the best model for your hardware)
- **100% local** — your files never leave your machine
- **GPU accelerated** — auto-detects Apple Silicon (Metal), NVIDIA (CUDA), AMD (ROCm)
- **Cross-platform** — native apps for macOS, Windows, and Linux
- **Global shortcut** (Experimental) — bring DocFinder to front from anywhere with a configurable hotkey
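
To make the semantic search idea concrete, here is a minimal sketch of storing embeddings in SQLite and ranking documents by cosine similarity. It is not DocFinder's actual code: the table layout, the function names, and especially `toy_embed` (a hashed bag-of-words stand-in for a real sentence-embedding model) are assumptions made so the example runs with the standard library only. Unlike a real model, the toy embedding matches on shared words, not on meaning.

```python
import math
import sqlite3
import struct
import zlib

DIM = 256  # toy embedding size (assumption); real models define their own dimension


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def pack(vec: list[float]) -> bytes:
    """Serialize a vector as float32 bytes so it fits in a BLOB column."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob: bytes) -> list[float]:
    """Inverse of pack(): 4 bytes per float32 value."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def build_index(docs: dict[str, str]) -> sqlite3.Connection:
    """Create an in-memory index mapping each path to its embedding."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chunks (path TEXT PRIMARY KEY, embedding BLOB NOT NULL)")
    conn.executemany(
        "INSERT INTO chunks (path, embedding) VALUES (?, ?)",
        [(path, pack(toy_embed(text))) for path, text in docs.items()],
    )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, query: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k (path, score) pairs, best match first."""
    q = toy_embed(query)
    scored = []
    for path, blob in conn.execute("SELECT path, embedding FROM chunks"):
        # Vectors are unit-length, so the dot product is the cosine similarity.
        scored.append((path, sum(a * b for a, b in zip(q, unpack(blob)))))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Calling `build_index({...})` on a mapping of paths to extracted text and then `search(conn, "some query")` returns the closest paths with their scores. A brute-force scan like this is only reasonable for small collections.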
## Download
| Platform | Installer |
|----------|-----------|
| **macOS** | [DocFinder-macOS.dmg](https://github.com/filippostanghellini/DocFinder/releases/latest) |
| **Windows** | [DocFinder-Windows-Setup.exe](https://github.com/filippostanghellini/DocFinder/releases/latest) |
| **Linux** | [DocFinder-Linux-x86_64.AppImage](https://github.com/filippostanghellini/DocFinder/releases/latest) |
**macOS** — open the DMG, drag DocFinder to Applications, then right-click → **Open** on first launch (Gatekeeper warning — normal for unsigned open-source apps).
**Windows** — run the installer; if SmartScreen appears choose **More info → Run anyway**.
**Linux**
```bash
chmod +x DocFinder-Linux-x86_64.AppImage && ./DocFinder-Linux-x86_64.AppImage
```
## Run from Source
Requires Python 3.10+ and `make`.
```bash
git clone https://github.com/filippostanghellini/DocFinder.git
cd DocFinder
make setup # create .venv and install all dependencies
make run # desktop GUI
make run-web # web interface at http://127.0.0.1:8000
```
### Runtime acceleration (auto)
DocFinder automatically selects the best available runtime on your machine:
- NVIDIA: ONNX CUDA provider when available, otherwise PyTorch CUDA
- AMD: ONNX ROCm provider when available, otherwise PyTorch fallback
- Apple Silicon: optimized ONNX path
- CPU-only hosts: ONNX or PyTorch CPU fallback
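
The ONNX side of this selection can be sketched as a simple preference list. This is an illustration under stated assumptions, not DocFinder's implementation: the ordering, the helper names `pick_provider` and `detect_available`, and the use of `CoreMLExecutionProvider` for the Apple Silicon path are all guesses, and the PyTorch fallback is left out.

```python
# Preference order is an assumption for illustration, not DocFinder's actual logic.
PREFERRED_PROVIDERS = [
    "CUDAExecutionProvider",    # NVIDIA
    "ROCMExecutionProvider",    # AMD
    "CoreMLExecutionProvider",  # Apple (assumed stand-in for the "optimized ONNX path")
    "CPUExecutionProvider",     # generic fallback
]


def pick_provider(available: list[str]) -> str:
    """Return the first preferred provider present in `available`, else the CPU provider."""
    for name in PREFERRED_PROVIDERS:
        if name in available:
            return name
    return "CPUExecutionProvider"


def detect_available() -> list[str]:
    """Ask ONNX Runtime which providers exist; report CPU only if it is not installed."""
    try:
        import onnxruntime
    except ImportError:
        return ["CPUExecutionProvider"]
    return onnxruntime.get_available_providers()
```

A caller would typically run `pick_provider(detect_available())` once at startup and pass the result when creating the inference session.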
By default, indexing uses a balanced parallel parsing strategy, selected automatically based on your machine's resources.
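
One way to picture resource-based parallel parsing is the sketch below. The README does not define what "balanced" means, so the rule used here (about half the CPU cores, never more workers than files) is purely an assumption, as are the names `choose_workers` and `parse_all` and the caller-supplied `parse` function.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Optional

# File types listed in this README.
SUPPORTED = {".pdf", ".docx", ".md", ".txt"}


def choose_workers(n_files: int, cpu_count: Optional[int] = None) -> int:
    """Assumed 'balanced' rule: about half the cores, at least 1, at most n_files."""
    cores = cpu_count or os.cpu_count() or 1
    return max(1, min(n_files, cores // 2 or 1))


def parse_all(
    paths: list[Path],
    parse: Callable[[Path], str],
    workers: Optional[int] = None,
) -> dict[Path, str]:
    """Run `parse` over supported files in a thread pool and map each path to its text."""
    todo = [p for p in paths if p.suffix.lower() in SUPPORTED]
    workers = workers or choose_workers(len(todo))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(parse, todo))  # map() preserves input order
    return dict(zip(todo, texts))
```

Threads keep the sketch simple. For CPU-heavy parsers, a process pool may scale better.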
## Contributing
Contributions are welcome. Feel free to open an issue or submit a pull request.
## License
Licensed under the **GNU Affero General Public License v3.0 (AGPL-3.0)**.
> DocFinder was originally released under the MIT License. Starting with version 1.1.1, the license was changed to AGPL-3.0 to comply with [PyMuPDF](https://pymupdf.readthedocs.io/) licensing requirements, as PyMuPDF itself is licensed under AGPL-3.0.