{"id":49245190,"url":"https://github.com/robeertm/docusort","last_synced_at":"2026-05-09T19:14:48.859Z","repository":{"id":353635644,"uuid":"1220230052","full_name":"robeertm/DocuSort","owner":"robeertm","description":"automatic docomentscanner and analyzer","archived":false,"fork":false,"pushed_at":"2026-04-24T21:30:36.000Z","size":171,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-24T21:37:55.367Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/robeertm.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-24T17:25:32.000Z","updated_at":"2026-04-24T21:30:38.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/robeertm/DocuSort","commit_stats":null,"previous_names":["robeertm/docusort"],"tags_count":24,"template":false,"template_full_name":null,"purl":"pkg:github/robeertm/DocuSort","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robeertm%2FDocuSort","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robeertm%2FDocuSort/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robeertm%2FDocuSort/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robeertm%2FDocuSort/manifests","owner_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/owners/robeertm","download_url":"https://codeload.github.com/robeertm/DocuSort/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robeertm%2FDocuSort/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32478162,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"ssl_error","status_checked_at":"2026-04-30T13:12:06.837Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-24T21:05:43.767Z","updated_at":"2026-04-30T22:00:49.871Z","avatar_url":"https://github.com/robeertm.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DocuSort\n\n**AI-powered, self-hosted document organizer with OCR, receipt scanning, and analytics.**\nUpload a scan from your desktop or phone and — a few seconds later — it is\nrenamed, dated, filed into the right category, and browsable in a clean,\nmobile-friendly interface. Receipts are auto-detected and broken down into\nstructured line items so you can see where your money actually goes.\n\nBuilt for a Synology NAS in Docker, but runs anywhere Docker (or just\nplain Python) runs. 
Pick your AI: Anthropic Claude, OpenAI GPT, Google\nGemini, or run a local model via Ollama — your documents never have to\nleave your network.\n\n![Dashboard](docs/screenshots/01-dashboard.png)\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd width=\"50%\"\u003e\u003ca href=\"docs/screenshots/02-library.png\"\u003e\u003cimg src=\"docs/screenshots/02-library.png\" alt=\"Library with year tree, full-text search, tag filters\" /\u003e\u003c/a\u003e\u003c/td\u003e\n    \u003ctd width=\"50%\"\u003e\u003ca href=\"docs/screenshots/03-analytics.png\"\u003e\u003cimg src=\"docs/screenshots/03-analytics.png\" alt=\"Analytics: spend per month, by shop type, by item category\" /\u003e\u003c/a\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003csub\u003e\u003cb\u003eLibrary\u003c/b\u003e — year tree, full-text search across OCR text, tag filters, ZIP export of any selection.\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003csub\u003e\u003cb\u003eAnalytics\u003c/b\u003e — receipts auto-extracted into line items: total spent, by shop type, by item category, top items, monthly trend.\u003c/sub\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"docs/screenshots/04-settings.png\"\u003e\u003cimg src=\"docs/screenshots/04-settings.png\" alt=\"Settings: AI provider switch + cloud sync target chooser\" /\u003e\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"docs/screenshots/05-backup.png\"\u003e\u003cimg src=\"docs/screenshots/05-backup.png\" alt=\"Local-folder backup with built-in path picker\" /\u003e\u003c/a\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003csub\u003e\u003cb\u003eSettings\u003c/b\u003e — switch AI providers (Anthropic / OpenAI / Gemini / Ollama) without losing keys, configure backup target.\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003csub\u003e\u003cb\u003eBackup\u003c/b\u003e — local rsync to a USB stick / NAS share (zero auth) 
or cloud via rclone with a headless OAuth-token-paste flow.\u003c/sub\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n- **Web UI** on a port you pick (default 8080, configurable in the wizard\n  and in `/settings`) — dashboard, library browser with full-text search,\n  per-document detail + PDF preview, mobile upload with camera capture\n- **First-run setup wizard** at `/setup` — pick language, AI provider, paste\n  token, optionally configure backup. Always-reachable settings page at `/settings`\n- **Multi-provider AI**: Anthropic Claude · OpenAI GPT · Google Gemini · any\n  OpenAI-compatible endpoint (Ollama, Groq, xAI, Mistral, OpenRouter …) ·\n  or the **Local AI Bridge** (run inference on a Mac / Linux / Windows\n  box of your choice — see below)\n- **Subcategories + free-form tags** — files land at\n  `Library/YYYY/Category/Subcategory/` and carry up to 8 lowercase labels\n- **Receipt scanner + analytics** — receipts (Kassenzettel) are recognised\n  automatically. A second-pass LLM extracts shop name + type, payment\n  method, total, and per-line items with prices and item categories.\n  Browse aggregated spend per month, by shop type and item category,\n  search line items, and see your most-bought items at `/analytics`.\n- **OCR** for scanned PDFs and images (Tesseract `deu+eng`)\n- **Backup** to a local folder (rsync, no setup) or to the cloud via rclone\n  (Drive · Dropbox · OneDrive · S3 · WebDAV · SFTP). 
Headless-friendly:\n  no browser needed on the host.\n- **Cost tracking** per document + aggregated (tokens in/out, USD and EUR preview),\n  with prompt-caching factored in for Anthropic and OpenAI\n- **Low-confidence review folder** instead of wrong guesses, full metadata\n  editing on the document detail page\n- **Trash + restore + permanent purge**, ZIP export of any filtered selection\n- **Safety copy** of every original kept in `_Processed/`\n- **i18n**: German · English · French · Italian · Spanish\n\n## File naming\n\nEvery filed document follows the same pattern:\n\n```\nYYYY-MM-DD_Category_Sender_Subject.pdf\n```\n\nExamples:\n\n```\n2026-02-14_Rechnungen_Vodafone_Mobilfunk-Februar.pdf\n2026-01-03_Gesundheit_Hausarzt-Dr-Mueller_Blutbild.pdf\n2026-03-20_Steuer_Finanzamt-Dresden_Bescheid-2024.pdf\n```\n\nThe template is configurable in `config/config.yaml`.\n\n## Folder layout\n\n```\n/data/\n├── inbox/                    ← drop scans here\n└── library/\n    ├── 2026/\n    │   ├── Rechnungen/\n    │   ├── Vertraege/\n    │   ├── Behoerde/\n    │   ├── Gesundheit/\n    │   ├── Gehalt/\n    │   ├── Steuer/\n    │   ├── Haus/\n    │   ├── Versicherung/\n    │   ├── Bank/\n    │   └── Sonstiges/\n    ├── _Review/              ← uncertain docs land here for manual sorting\n    └── _Processed/           ← copy of every original file\n```\n\n## Requirements\n\n- Docker and docker-compose (Synology: install \"Container Manager\" from Package\n  Center, DSM 7.2+) — or run directly on Linux/macOS via the launchers below\n- A folder on your NAS where scans arrive (e.g. `/volume1/Scan`)\n- A folder that will become your library (e.g. `/volume1/Dokumente`)\n- **An API key for one** of:\n  - Anthropic Claude (`ANTHROPIC_API_KEY`, recommended — cheapest with prompt caching)\n  - OpenAI GPT (`OPENAI_API_KEY`)\n  - Google Gemini (`GEMINI_API_KEY`)\n  - …or no key at all if you run a local model via Ollama. 
Hardware\n    suggestion: 8 GB RAM for an 8B model, 16 GB for a 13B, GPU recommended.\n\n## Quick start on Synology\n\n1. **Copy the project** to your NAS, e.g. to\n   `/volume1/docker/docusort/`. Via File Station, SFTP, or:\n   ```bash\n   scp -r docusort admin@synology:/volume1/docker/\n   ```\n\n2. **Adjust `docker-compose.yml`** if your paths differ. Defaults:\n   ```yaml\n   volumes:\n     - /volume1/Scan:/data/inbox\n     - /volume1/Dokumente:/data/library\n     - /volume1/docker/docusort/config:/app/config\n     - /volume1/docker/docusort/logs:/app/logs\n   ```\n\n3. **Build and start**:\n   ```bash\n   sudo docker compose up -d --build\n   ```\n\n4. **Check the logs**:\n   ```bash\n   sudo docker logs -f docusort\n   ```\n\n5. **Open the UI** at `http://\u003cnas-ip\u003e:\u003cport\u003e` (default `8080`; you can\n   change it during setup or later in `/settings`). On first start the\n   **setup wizard** at `/setup` walks you through language, AI provider +\n   token, an optional port + host, and an optional backup target. The\n   wizard writes `config/secrets.yaml` (mode 0600, gitignored) and updates\n   `config/config.yaml`. 
After the final step the service restarts itself\n   and lands you on the dashboard.\n\n   Dropping a PDF into `/volume1/Scan` then works — it appears correctly\n   named under `/volume1/Dokumente/2026/…/`.\n\n   You can revisit any of those choices later under **Einstellungen** (the\n   cog in the header) — provider, model, API keys, paths, sync target.\n\n\u003e **Legacy env-var setup also still works** — if you set\n\u003e `ANTHROPIC_API_KEY` (or `OPENAI_API_KEY` / `GEMINI_API_KEY`) in your\n\u003e `.env`, DocuSort picks it up and skips the wizard's token step.\n\n## Quick start locally (Mac / Linux / Windows)\n\nThree launcher scripts live in the project root — pick the one that matches\nyour OS:\n\n- **macOS**: double-click `start.command` (or `./start.sh` from a Terminal)\n- **Linux**: `./start.sh`\n- **Windows**: double-click `start.bat`\n\nEach launcher creates a `.venv` on first run, keeps Python deps in sync,\nwarns if tesseract / ocrmypdf are missing, and then boots the app on the\nconfigured port (default `http://localhost:8080`). Open the URL — the\n**setup wizard** at `/setup` collects everything else, including the\nport if you want a different one.\n\nIf you'd rather pre-seed the API key as an env var instead of typing it\nin the wizard, drop a `.env` next to the launcher with one of:\n\n```\nANTHROPIC_API_KEY=sk-ant-...\nOPENAI_API_KEY=sk-...\nGEMINI_API_KEY=AIza...\n```\n\nOCR needs system-level Tesseract and ocrmypdf installed\n(`brew install tesseract tesseract-lang ocrmypdf` on macOS,\n`sudo apt install tesseract-ocr tesseract-ocr-deu ocrmypdf` on Debian/Ubuntu).\n\n## HTTPS (required for background uploads)\n\nBrowsers only run service workers in a secure context — over plain HTTP\nuploads work but run in the foreground (keep the tab open). 
Flipping to\nHTTPS buys you true background uploads that survive a tab close.\n\nOn a Tailscale-attached host, one script does everything:\n\n```bash\n./scripts/setup-tailscale-https.sh\n```\n\nIt grabs a Let's Encrypt cert via `tailscale cert`, installs a weekly\nsystemd timer that renews it, and updates `config/config.yaml` with the\ncert paths. After `sudo systemctl restart docusort` the UI lives at\n`https://\u003chost\u003e.\u003ctailnet\u003e.ts.net:9876`.\n\nTo do it by hand, set these under `web:` in `config/config.yaml`:\n\n```yaml\nweb:\n  ssl_cert: \"/etc/docusort/certs/yourhost.ts.net.crt\"\n  ssl_key:  \"/etc/docusort/certs/yourhost.ts.net.key\"\n```\n\nAny PEM cert/key pair works (Caddy, certbot, self-signed). Uvicorn picks\nthem up on next start and serves TLS on the configured port.\n\n## Updates\n\nDocuSort ships with a built-in updater that pulls the newest release\nstraight from GitHub:\n\n- **Web UI**: a banner appears on every page when a newer version is\n  available — one click installs it.\n- **CLI**: `python -m docusort --check-update` and\n  `python -m docusort --update`.\n\nOn systemd hosts, enable the one-click restart by installing the scoped\nsudoers rule once:\n\n```bash\n./scripts/install-sudoers-rule.sh\n```\n\nThe rule grants `NOPASSWD` only for `systemctl restart docusort`.\n\n## Local AI: keep documents on your network\n\nDocuSort can do every classification and bank-statement extraction\nlocally — no token, no cloud round-trip, no per-document cost. Three\nways to set it up depending on where you want the model to actually\nrun.\n\n### A. DocuSort and Ollama on the same machine\n\nThe simplest path. 
If you start DocuSort directly with\n`./start.command` / `start.sh` / `start.bat`, install Ollama on the\nsame machine and DocuSort can talk to it directly:\n\n```bash\n# macOS\nbrew install ollama \u0026\u0026 brew services start ollama\nollama pull qwen2.5:7b-instruct\n\n# Linux\ncurl -fsSL https://ollama.com/install.sh | sh\nollama pull qwen2.5:7b-instruct\n\n# Windows\nwinget install Ollama.Ollama\nollama pull qwen2.5:7b-instruct\n```\n\nThen open `/settings` in DocuSort. The page detects the local\nOllama and shows a **Local Ollama on this machine** card with a\nmodel dropdown — pick a model, click **Use this on this machine**,\nrestart, done. Provider is set to `openai_compat`, base URL is\n`http://127.0.0.1:11434/v1`.\n\n### B. DocuSort on a NAS / VM, model on a different machine\n\nUse the **Local AI Bridge** when DocuSort itself runs somewhere\nthat cannot host a 7B+ model (Synology DS218 with 2 GB RAM, a tiny\nVM, a Raspberry Pi). The bridge is a Python script that runs on\nthe machine you *do* want to do inference on (Mac / Linux box /\nWindows desktop) and connects outbound to DocuSort over a\nWebSocket. No port forwarding, no firewall changes — anything that\ncan open the DocuSort URL in a browser can run the bridge.\n\n1. In DocuSort, switch the AI provider to **Local AI Bridge** in\n   `/settings`.\n2. Scroll to the **Local AI Bridge** card and download the\n   launcher for your OS — there are three buttons: **macOS**\n   (`.command`), **Windows** (`.bat`), **Linux** (`.sh`). The\n   launcher already contains the server URL and a shared-secret\n   token.\n3. Double-click the file you just downloaded. macOS: first launch\n   may show \"from an unidentified developer\" — right-click → Open.\n   Windows: SmartScreen may say \"Windows protected your PC\" —\n   click *More info* → *Run anyway*.\n4. 
The launcher auto-installs Ollama (Homebrew on macOS, the\n   official installer on Linux, winget on Windows), starts\n   `ollama serve`, pulls the requested model the first time, and\n   stays connected until you press Ctrl-C.\n5. The Settings card flips to a green **connected** badge with the\n   bridge host's name, OS, and model. Hit **Test** to round-trip a\n   prompt through the bridge and confirm it answers.\n\nThe bridge tolerates network blips: a 120-second reconnect grace\nwindow holds in-flight requests open across a brief WebSocket drop,\nand the bridge client buffers any computed response that could not\nbe delivered before the disconnect. Long bank-statement extractions\nthat take 10+ minutes on a small model survive Tailscale or Wi-Fi\nhiccups without losing work.\n\nThe **Test** button also exposes a per-statement progress bar in\n`/finance`: the **Alle auswerten** banner runs through every\nunprocessed bank statement (Kontoauszug) in the background, one by one,\nand reports done / failed counts as it goes. This lets you start a bulk\nextraction on a quiet evening and check the result the next morning.\n\n### C. External OpenAI-compatible endpoint\n\nFor users who run their own inference cluster, or who want to use\nGroq / Together / xAI / OpenRouter / Mistral, pick **OpenAI-compatible**\nin `/settings`, paste the base URL (`https://api.groq.com/openai/v1`,\n`http://192.168.1.50:8080/v1`, …) and an API key if needed.\n\n## Notifications\n\nDocuSort can ping you out of band when a document needs your\nattention. Configure under **Settings → Notifications**:\n\n- **Telegram** — create a bot via [@BotFather](https://t.me/BotFather),\n  send any message to your bot, then visit\n  `https://api.telegram.org/bot\u003cTOKEN\u003e/getUpdates` to find your\n  numeric `chat_id`. Paste both into the form.\n- **Email** — standard SMTP. 
Works with Gmail (use an\n  [app password](https://support.google.com/accounts/answer/185833)),\n  Fastmail, or your own server.\n\nPer-event toggles control the noise:\n\n- *Document landed in review* — the classifier was unsure or the\n  doc has incomplete metadata.\n- *Classification failed* — the LLM call raised an exception.\n- *Document filed* — every successful filing (off by default — too\n  chatty for normal use).\n- *Bulk job finished* — `analyze-all`, `retry-review`, and friends\n  emit a summary message with the success / failure tally.\n\nEach notification carries a clickable URL back to the document\ndetail page. Channel credentials live in `secrets.yaml` (mode 0600)\nand are never logged.\n\n## Duplicates\n\nA new **Duplicates** page (`/duplicates`) groups every byte-identical\npair in the library by SHA-256 hash and offers a one-click bulk\ntrash action. Pick which copy to keep per group (default: oldest)\nor sweep them all at once. The dashboard shows an amber banner\nwhen groups exist, so you do not have to remember to look.\n\n## Configuration\n\nAll behaviour is controlled by three files in `config/`:\n\n- `config.yaml` – paths, OCR settings, AI provider/model, sync target, thresholds\n- `categories.yaml` – the list of categories and their subcategories\n- `secrets.yaml` – API keys (mode 0600, gitignored). Written by the wizard.\n\nMost users never edit these directly — the **setup wizard** and the\n**`/settings`** page cover everything. The knobs that matter:\n\n| Setting | Default | What it does |\n|---|---|---|\n| `ai.provider` | `anthropic` | `anthropic` · `openai` · `gemini` · `openai_compat` |\n| `ai.model` | `claude-haiku-4-5-20251001` | Provider-specific model id |\n| `ai.base_url` | `\"\"` | Only for `openai_compat` (e.g. 
`http://localhost:11434/v1` for Ollama) |\n| `ai.min_confidence` | `0.65` | Documents below this go to `_Review` |\n| `ocr.languages` | `deu+eng` | Tesseract language packs |\n| `ocr.max_parallel` | `2` | Cap on concurrent OCR + AI jobs (memory bound) |\n| `sync.target_type` | `local` | `local` (rsync to a folder) or `rclone` (cloud) |\n| `sync.local_path` | `\"\"` | Target folder for local-mode backup |\n| `sync.remote` | `\"\"` | rclone remote, format `\u003cname\u003e:\u003cpath\u003e` |\n| `keep_original` | `true` | Keep an untouched copy of each original in `_Processed` |\n| `dry_run` | `false` | Classify and log but don't move anything |\n\nAfter changing config from the CLI, restart the service. From the UI the\nwizard handles the restart for you.\n\n## CLI flags\n\n```bash\npython -m docusort            # watcher + web UI on the configured port (default 8080)\npython -m docusort --once     # process existing files and exit\npython -m docusort --no-web   # watcher only, no UI\npython -m docusort --dry-run  # classify + log, no moves\npython -m docusort --version\n```\n\n## How it decides\n\n1. File appears in `inbox/`.\n2. Watcher waits until the file size stops changing (default 5 s).\n3. If the PDF has no text layer, `ocrmypdf` adds one.\n4. The first ~12 k characters go to the configured AI provider, together\n   with the category list and the prompt that forces JSON output.\n5. The model replies with strict JSON: `category, subcategory, tags,\n   date, sender, subject, confidence, reasoning`.\n6. Confidence ≥ 0.65 → move to `library/YYYY/Category/Subcategory/`\n   (subcategory dir is omitted when empty). Lower → move to `_Review/`\n   for a human look.\n7. 
The original is copied to `_Processed/` before being removed from `inbox/`.\n\n## Cost\n\nPer provider (typical one-page letter, ~3 k input + 200 output tokens):\n\n| Provider | Model | Roughly per doc |\n|---|---|---|\n| Anthropic | Haiku 4.5 (with prompt cache) | ~$0.0005 |\n| Anthropic | Sonnet 4.6 (with prompt cache) | ~$0.005 |\n| OpenAI | gpt-4o-mini | ~$0.0008 |\n| OpenAI | gpt-4o | ~$0.015 |\n| Google | Gemini 2.5 Flash | ~$0.0005 |\n| Google | Gemini 2.5 Pro | ~$0.008 |\n| Local | Ollama (any model) | $0 — only your electricity |\n\nA batch of 1 000 documents per month with Haiku 4.5 stays well under\nEUR 1 in API fees. The dashboard shows actual cost across all providers\nin real time, with cache savings (Anthropic) and cached-prompt savings\n(OpenAI) credited.\n\n## Trash, Export, Cloud sync\n\n### Trash\nEvery document detail page has a **move-to-trash** button. Trashed documents\nmove into a `_Trash/` tree that mirrors the category layout on disk and become\nhidden from the dashboard, tree and stats — but stay in the DB so they're\nrecoverable. The library's tree sidebar gets a \"Papierkorb\" entry whenever\nthe trash is non-empty. From there you can restore or permanently purge\nindividual items, or empty the whole trash.\n\n### Export\n- **Dashboard** → \"ZIP laden\" → downloads the whole library as a single ZIP.\n- **Library filtered** → export a single year, a single category, or both.\n- `_Trash/` is excluded by default.\n- The download is streamed, so multi-GB exports don't spike memory.\n\n### Backup\n\nTwo backup paths, picked from the wizard or `/settings` → \"Backup\":\n\n#### Local folder (recommended, zero auth)\n\nMirror the library to any path on the host with rsync — a mounted USB\nstick, NAS share, NFS/SMB mount, second disk. No tokens, no OAuth.\n\nIn the UI: pick the **\"Lokaler Ordner / NAS-Mount\"** tile, browse to the\nfolder with the built-in folder picker (or paste a path), enable. 
Equivalent\nconfig:\n\n```yaml\nsync:\n  enabled: true\n  target_type: local\n  local_path: /mnt/backup/docusort\n```\n\nBacked by `rsync -a --delete --delete-excluded --exclude=_Trash/`. If\nrsync isn't installed, DocuSort falls back to a slower pure-Python copy.\n\n#### Cloud (rclone)\n\nDocuSort uses [rclone](https://rclone.org/) for cloud sync — whatever\nrclone supports, DocuSort can sync to. **Headless-friendly**: no\nbrowser needed on the host. On the machine running DocuSort:\n\n```bash\nsudo apt install rclone     # Debian/Ubuntu\nbrew install rclone         # macOS\n```\n\nThen in **`/settings` → Backup → Cloud-Speicher (rclone)**:\n\n- **WebDAV / Nextcloud · SFTP · S3 / R2 / MinIO**: simple form — URL,\n  credentials, done. No OAuth, works on any headless machine.\n- **Google Drive · Dropbox · OneDrive** (folded behind \"Show OAuth\n  providers\"): the only flow that needs OAuth. On a separate machine\n  *with* a browser, run e.g. `rclone authorize \"drive\"` — it spawns a\n  one-shot OAuth dance, prints a JSON token. Paste that token into the\n  textarea in the UI; DocuSort writes the remote into `rclone.conf`\n  for you. No `rclone config` interaction on the host.\n\nA \"Test\" button next to each remote runs `rclone lsd \u003cremote\u003e:` so you\ncatch broken auth before flipping `enabled: true`. 
Broken OAuth remotes\n(empty `token` field in `rclone.conf`) get a red \"defekt\" badge with a\none-click **Reconnect** button that re-opens the token-paste form.\n\nFor scheduled sync, point a systemd timer at\n`curl -XPOST http://localhost:\u003cport\u003e/api/sync/run` (port defaults to 8080;\nthe value lives under `web.port` in `config.yaml`).\n\n## Roadmap\n\n- ~~Etappe 2: Web UI, cost tracking, SQLite + FTS5 search~~ — shipped in **v0.2.0**\n- ~~Local-AI bridge so a small NAS can offload inference to a beefier\n  desktop on the same network~~ — shipped in **v0.19.0** + robustness\n  pass in **v0.21.0**\n- ~~Etappe 3: Telegram / email notification on new file or `_Review` entry~~ — shipped in **v0.22.0**\n- ~~Etappe 4: Duplicate detection across the whole library~~ — shipped in **v0.22.0**\n- ~~Etappe 6: Prompt caching for bulk imports (reuse system prompt across calls)~~ — Anthropic ephemeral cache, already shipped earlier\n- Etappe 5: Automatic reminders for contract termination dates\n\n## License\n\nProprietary — see [`LICENSE`](LICENSE).\n\nDocuSort is source-available but not open source. You may download, install\nand run it for personal, non-commercial use, and read the source code for\ninspection and security review. Modification, redistribution, derivative\nworks, and commercial use require prior written permission from the\ncopyright holder.\n\nVersions up to and including **v0.12.3 are still available under the MIT\nLicense** for anyone who obtained a copy of those releases — that does\nnot change retroactively. The proprietary terms apply to v0.12.4 and\nlater.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobeertm%2Fdocusort","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frobeertm%2Fdocusort","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobeertm%2Fdocusort/lists"}