https://github.com/filippostanghellini/docfinder

DocFinder is a local-first tool for indexing and searching documents using semantic embeddings stored in SQLite. Everything runs on your machine; no external services are required.

# DocFinder

[![CI](https://img.shields.io/github/actions/workflow/status/filippostanghellini/DocFinder/ci.yml?branch=main&label=CI&logo=github)](https://github.com/filippostanghellini/DocFinder/actions/workflows/ci.yml)
[![CodeQL](https://img.shields.io/github/actions/workflow/status/filippostanghellini/DocFinder/codeql.yml?branch=main&label=CodeQL&logo=github)](https://github.com/filippostanghellini/DocFinder/actions/workflows/codeql.yml)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.10%2B-blue?logo=python&logoColor=white)](https://www.python.org/downloads/)
[![Release](https://img.shields.io/github/v/release/filippostanghellini/DocFinder?logo=github)](https://github.com/filippostanghellini/DocFinder/releases)
[![Downloads](https://img.shields.io/github/downloads/filippostanghellini/DocFinder/total?logo=github)](https://github.com/filippostanghellini/DocFinder/releases)


DocFinder Logo


Local-first semantic search for your documents.

Supports PDF, Word (.docx), Markdown, and plain text files.

Everything runs on your machine — no cloud, no accounts, complete privacy.


DocFinder Demo

## Features

- **Semantic search** — find documents by meaning, not just keywords (PDF, DOCX, Markdown, TXT)
- **AI chat** — ask questions about any document and get precise answers, powered by local Qwen models (automatically selects the best model for your hardware)
- **100% local** — your files never leave your machine
- **GPU accelerated** — auto-detects Apple Silicon (Metal), NVIDIA (CUDA), AMD (ROCm)
- **Cross-platform** — native apps for macOS, Windows, and Linux
- **Global shortcut** (Experimental) — bring DocFinder to front from anywhere with a configurable hotkey
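
As a rough illustration of how semantic search over embeddings stored in SQLite can work, here is a minimal sketch using only the Python standard library. It is not DocFinder's actual schema or code: the `chunks` table, the helper names, and the toy 3-dimensional vectors are all hypothetical, and a real setup would get its vectors from an embedding model.

```python
import math
import sqlite3
from array import array


def to_blob(vector):
    """Pack a list of floats into a compact float32 BLOB."""
    return array("f", vector).tobytes()


def from_blob(blob):
    """Unpack a float32 BLOB back into a list of floats."""
    values = array("f")
    values.frombytes(blob)
    return values.tolist()


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(conn, query_vector, top_k=3):
    """Rank every stored chunk by similarity to the query vector."""
    rows = conn.execute("SELECT path, embedding FROM chunks").fetchall()
    scored = [(cosine(query_vector, from_blob(blob)), path) for path, blob in rows]
    return sorted(scored, reverse=True)[:top_k]


# Hypothetical schema: one row per indexed chunk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (path TEXT, embedding BLOB)")
# Toy 3-dimensional vectors stand in for real model output.
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("invoice.pdf", to_blob([1.0, 0.0, 0.0])),
        ("recipe.md", to_blob([0.0, 1.0, 0.0])),
        ("receipt.docx", to_blob([0.9, 0.1, 0.0])),
    ],
)
results = search(conn, [1.0, 0.0, 0.0], top_k=2)
print([path for _, path in results])
```

A brute-force scan like this is easy to follow and adequate for small collections; larger indexes would usually need batching or an approximate nearest-neighbor structure.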


DocFinder Chat

## Download

| Platform | Installer |
|----------|-----------|
| **macOS** | [DocFinder-macOS.dmg](https://github.com/filippostanghellini/DocFinder/releases/latest) |
| **Windows** | [DocFinder-Windows-Setup.exe](https://github.com/filippostanghellini/DocFinder/releases/latest) |
| **Linux** | [DocFinder-Linux-x86_64.AppImage](https://github.com/filippostanghellini/DocFinder/releases/latest) |

**macOS** — open the DMG, drag DocFinder to Applications, then right-click → **Open** on first launch (Gatekeeper warning — normal for unsigned open-source apps).

**Windows** — run the installer; if SmartScreen appears choose **More info → Run anyway**.

**Linux**
```bash
chmod +x DocFinder-Linux-x86_64.AppImage && ./DocFinder-Linux-x86_64.AppImage
```

## Run from Source

Requires Python 3.10+ and `make`.

```bash
git clone https://github.com/filippostanghellini/DocFinder.git
cd DocFinder
make setup # create .venv and install all dependencies
make run # desktop GUI
make run-web # web interface at http://127.0.0.1:8000
```

### Runtime acceleration (auto)

DocFinder automatically selects the best available runtime on your machine:

- NVIDIA: ONNX CUDA provider when available, otherwise PyTorch CUDA
- AMD: ONNX ROCm provider when available, otherwise PyTorch fallback
- Apple Silicon: optimized ONNX path
- CPU-only hosts: ONNX or PyTorch CPU fallback
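
The fallback order above can be sketched as a small pure function. This is an illustration under stated assumptions, not DocFinder's implementation: `choose_runtime`, the hardware labels, and the returned tuples are hypothetical, and mapping Apple Silicon to the CoreML provider is a guess at what the "optimized ONNX path" might use. The provider strings themselves are ONNX Runtime execution-provider names.

```python
# Preferred ONNX Runtime execution provider per hardware family.
# The Apple Silicon -> CoreML mapping is an assumption for illustration.
PREFERRED = {
    "nvidia": ["CUDAExecutionProvider"],
    "amd": ["ROCMExecutionProvider"],
    "apple": ["CoreMLExecutionProvider"],
    "cpu": [],
}


def choose_runtime(hardware, onnx_providers):
    """Return a (backend, device_or_provider) pair for the given host.

    hardware: one of the hypothetical labels "nvidia", "amd", "apple", "cpu".
    onnx_providers: provider names that ONNX Runtime reports as available.
    """
    # First choice: an accelerated ONNX provider matching the hardware.
    for provider in PREFERRED.get(hardware, []):
        if provider in onnx_providers:
            return ("onnx", provider)
    # Second choice: PyTorch on GPU hosts without a matching ONNX provider.
    if hardware == "nvidia":
        return ("pytorch", "cuda")
    if hardware == "amd":
        return ("pytorch", "default")
    # Last resort: CPU via ONNX when available, otherwise PyTorch.
    if "CPUExecutionProvider" in onnx_providers:
        return ("onnx", "CPUExecutionProvider")
    return ("pytorch", "cpu")


print(choose_runtime("nvidia", ["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(choose_runtime("nvidia", ["CPUExecutionProvider"]))
```

In practice the provider list could come from `onnxruntime.get_available_providers()`; taking it as a parameter keeps the sketch testable without ONNX Runtime installed.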

By default, indexing uses a balanced parallel parsing strategy, chosen automatically based on your
machine's resources.
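
One plausible reading of a "balanced" strategy is to use about half of the available cores, within fixed bounds. The sketch below shows that idea only. The heuristic, the cap of 8, and the names `balanced_workers`, `parse`, and `parse_all` are all assumptions, not DocFinder's actual logic.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def balanced_workers(cpu_count=None, cap=8):
    """Use about half the cores: at least 1 and at most `cap` (assumed heuristic)."""
    cores = cpu_count or os.cpu_count() or 1
    return max(1, min(cap, cores // 2))


def parse(path):
    """Placeholder parser: a real one would extract text from the file."""
    return path, len(path)


def parse_all(paths, workers=None):
    """Parse files in parallel, preserving input order in the result."""
    with ThreadPoolExecutor(max_workers=workers or balanced_workers()) as pool:
        return list(pool.map(parse, paths))


print(balanced_workers(16), balanced_workers(4), balanced_workers(1))
print(parse_all(["a.pdf", "notes.md"]))
```

Leaving some cores idle keeps the machine responsive during indexing, which matters for a desktop app running alongside other work.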

## Contributing

Contributions are welcome. Feel free to open an issue or submit a pull request.

## License

Licensed under the **GNU Affero General Public License v3.0 (AGPL-3.0)**.

> DocFinder was originally released under the MIT License. Starting with version 1.1.1, the license changed to AGPL-3.0 to comply with the licensing requirements of [PyMuPDF](https://pymupdf.readthedocs.io/), which is itself licensed under AGPL-3.0.