https://github.com/anym001/pocketlog-importer

Companion importer for PocketLog — parses bank CSV exports, applies a rules whitelist, imports via the PocketLog API.

# PocketLog Importer

[![Tests](https://img.shields.io/github/actions/workflow/status/anym001/pocketlog-importer/test.yml?label=Tests)](https://github.com/anym001/pocketlog-importer/actions/workflows/test.yml)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://github.com/anym001/pocketlog-importer/blob/HEAD/LICENSE)
[![Release](https://img.shields.io/github/v/release/anym001/pocketlog-importer?label=Release)](https://github.com/anym001/pocketlog-importer/releases)
[![GHCR](https://img.shields.io/badge/GHCR-pocketlog--importer-2496ED?logo=docker&logoColor=white)](https://github.com/anym001/pocketlog-importer/pkgs/container/pocketlog-importer)
[![Docker Hub](https://img.shields.io/badge/Docker%20Hub-pocketlog--importer-2496ED?logo=docker&logoColor=white)](https://hub.docker.com/r/anym001/pocketlog-importer)

A small Docker container that turns bank CSV exports (**easybank**, **dadat**)
into [PocketLog](https://github.com/anym001/pocketlog) transactions. You drop a
CSV into a folder, a rules whitelist decides what gets imported (description,
category, tags), and the result is pushed to PocketLog via its CSV import API.

Images are published to **GHCR** and **Docker Hub** — use whichever you prefer
(`image:` in your compose file):

```
ghcr.io/anym001/pocketlog-importer:
anym001/pocketlog-importer: # Docker Hub
```

## How it works

```
bank export ─▶ /data/input ─▶ parse ─▶ rules.yaml (whitelist) ─▶ /data/output ─▶ PocketLog API
                                           │
                                           └─ no match ─▶ .unmatched.csv (review)
```

1. The container runs an **internal scheduler** (cron, default hourly).
2. Each `*.csv` in `/data/input` is auto-detected (easybank / dadat), parsed and
   normalised (amount always positive, direction in `type`).
3. Every booking is matched against `rules.yaml` (regex, case-insensitive,
   first match wins). **A booking that matches no rule is dropped** — only
   curated bookings reach PocketLog. Dropped bookings are written to a
   `*.unmatched.csv` for review so you can add a rule later.
4. Matched bookings are written to `/data/output/-.csv` and imported
   via `POST /api/import/csv`. PocketLog deduplicates, so re-runs are safe.
   Transient API failures (network errors, 5xx, 429) are retried with
   exponential backoff before a file counts as failed.
5. The processed original is moved to `/data/processed/`. Files that fail to
   parse or import go to `/data/failed/`.
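
The retry behaviour in step 4 can be sketched as follows. This is a minimal illustration, not the importer's actual code: `TransientError`, `is_retryable`, `with_backoff`, the attempt count and the base delay are all hypothetical names and values chosen for the example.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (network error, 5xx, 429)."""


def is_retryable(status: int) -> bool:
    # Assumption: 429 and any 5xx response count as transient.
    return status == 429 or 500 <= status <= 599


def with_backoff(call, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run `call`, retrying on TransientError with delays of 1s, 2s, 4s, ..."""
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: the caller treats the file as failed
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Because PocketLog deduplicates, repeating an upload after a partial failure should be harmless.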

## Quick start

1. **Create an API key** in PocketLog (UI → API keys) with the **`import`** scope.
2. Prepare the host folders and start the container (see
   `docker/docker-compose.example.yml`):
   ```sh
   mkdir -p config data/input
   POCKETLOG_API_KEY=plk_xxx docker compose -f docker/docker-compose.example.yml up -d
   ```
   On first start the container seeds `config/config.yaml` and
   `config/rules.yaml` from the bundled examples if they are missing (it logs a
   WARNING). Then edit `config/config.yaml` → set `pocketlog.base_url`, edit
   `config/rules.yaml` to match your bookings, and restart.
   > To configure **before** the first start instead, copy the examples
   > yourself: `cp config/config.example.yaml config/config.yaml` and
   > `cp config/rules.example.yaml config/rules.yaml`.
3. Drop a bank CSV into `data/input/`. The scheduler picks it up; or trigger it
   immediately:
   ```sh
   docker exec pocketlog-importer pocketlog-import --once
   ```

### Try it safely first (dry-run)

`--dry-run` writes the output CSVs but does **not** import anything:

```sh
docker exec pocketlog-importer pocketlog-import --once --dry-run
```

## Triggering

Three ways to run the pipeline (the dry-run variant skips the import step):

| Method | Command |
|---|---|
| Automatic | internal scheduler (`schedule.cron` in `config.yaml`) |
| On demand | `docker exec pocketlog-importer pocketlog-import --once` |
| Test | `... pocketlog-import --once --dry-run` |

The `--once` path is ideal for **Unraid User Scripts**. A file lock prevents a
manual run from overlapping with a scheduler tick.
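
How such a lock can work is sketched below, assuming an advisory `flock` on a lock file (Linux/Unix only). The README does not say which locking mechanism the importer uses, so `run_lock` and the lock path are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def run_lock(path):
    """Yield True if this run holds the lock, False if another run already does."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # Non-blocking exclusive lock: fails immediately if someone holds it.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            acquired = True
        except BlockingIOError:
            acquired = False
        yield acquired
    finally:
        os.close(fd)  # closing the descriptor releases a lock we held
```

A run would wrap the pipeline in `with run_lock("/data/.run.lock") as ok:` (path chosen for illustration) and skip the tick when `ok` is false.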

### Container health

Every run (idle ones included) touches a heartbeat file. The image's
`HEALTHCHECK` runs `pocketlog-import --healthcheck`, which reports unhealthy
once the heartbeat is older than ~2 cron intervals — so a wedged scheduler
shows up directly in `docker ps` / the Unraid dashboard instead of going
unnoticed. The threshold adapts to `schedule.cron` automatically.
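
The heartbeat check can be pictured like this. It is a sketch under assumptions: the function names are hypothetical, and the cron interval is passed in as seconds, whereas the real tool derives it from `schedule.cron`.

```python
import os
import time


def touch_heartbeat(path):
    """Create the heartbeat file if needed and set its mtime to now."""
    with open(path, "a"):
        pass
    os.utime(path, None)


def is_healthy(path, interval_seconds, now=None, tolerance=2.0):
    """Healthy while the heartbeat is at most `tolerance` intervals old."""
    current = time.time() if now is None else now
    try:
        age = current - os.path.getmtime(path)
    except FileNotFoundError:
        return False  # no run has ever written a heartbeat
    return age <= tolerance * interval_seconds
```

With an hourly schedule, `is_healthy(path, 3600)` would turn false roughly two hours after the last run.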

## Configuration

`config/config.yaml` — see [`config/config.example.yaml`](https://github.com/anym001/pocketlog-importer/blob/HEAD/config/config.example.yaml).
The PocketLog **API key is never stored in YAML**; provide it via the
`POCKETLOG_API_KEY` environment variable.

### Rules

`config/rules.yaml` — see [`config/rules.example.yaml`](https://github.com/anym001/pocketlog-importer/blob/HEAD/config/rules.example.yaml).

```yaml
rules:
  - match: "STREAMINGCO"             # regex, case-insensitive, tested against booking text
    description: "Streaming Service" # overrides description (default: raw booking text)
    category: "Entertainment"        # PocketLog category (auto-created if new)
    tags: [subscription]             # tags (auto-created if new)
    # type: in                       # optional, overrides the amount-sign direction
    # bank: easybank                 # optional, restrict to one parser
```

Rules are evaluated top to bottom; the **first** matching rule wins.
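
First-match-wins evaluation can be sketched in a few lines. This assumes rules are plain dictionaries as loaded from YAML and that the pattern may match anywhere in the booking text (`re.search`); `match_rule` is a hypothetical helper, not the importer's API.

```python
import re


def match_rule(rules, text, bank=None):
    """Return the first rule whose regex matches `text`, or None (booking is dropped)."""
    for rule in rules:
        # An optional `bank` key restricts the rule to one parser.
        if rule.get("bank") not in (None, bank):
            continue
        if re.search(rule["match"], text, re.IGNORECASE):
            return rule
    return None
```

Because evaluation stops at the first hit, specific patterns belong above broad ones in `rules.yaml`.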

### Notifications

Optional push notifications about run outcomes via any **Gotify-compatible**
endpoint — this includes [PushBits](https://github.com/pushbits/server)
(relays to Matrix) and Gotify itself. Off unless `notify.url` is set:

```yaml
notify:
  type: gotify                      # PushBits + Gotify
  url: https://pushbits.example.com
  events: problems                  # problems (default) | always
```

The application token goes into the `NOTIFY_TOKEN` environment variable —
never into YAML. `events: problems` notifies only on failed files, unmatched
bookings, or a crashed run (high priority); `events: always` also reports
clean runs. Idle runs (empty input directory) and dry-runs never notify, and
notifications carry only counters and filenames — no booking data.
Notification delivery is best-effort: a failed push is logged and never
affects the import itself.
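
A Gotify-style push is a single HTTP POST to `/message` with the application token. The sketch below uses only the standard library and passes the token as a `token` query parameter; the function names and the default priority are assumptions for illustration, not the importer's code.

```python
import json
import logging
import urllib.parse
import urllib.request


def build_notification(base_url, token, title, message, priority=5):
    """Build (but do not send) a Gotify-compatible POST /message request."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps({"title": title, "message": message, "priority": priority})
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/message?{query}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_best_effort(request, timeout=10.0):
    """Send the push; log and swallow failures so the import is never affected."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        logging.getLogger(__name__).warning("notification failed: %s", exc)
        return False
```

In line with the text above, the message would carry only counters and filenames, for example `"2 imported, 1 unmatched"`.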

## Environment variables

| Variable | Default | Purpose |
|---|---|---|
| `POCKETLOG_API_KEY` | — | **Required** for real imports (`import` scope key) |
| `POCKETLOG_BASE_URL` | — | Optional override of `pocketlog.base_url` |
| `NOTIFY_TOKEN` | — | Application token for `notify.url` (PushBits/Gotify) |
| `PUID` / `PGID` | `1000` | Ownership of `/config` + `/data` (Unraid: `99` / `100`) |
| `LOG_LEVEL` | `INFO` | Log verbosity |
| `LOG_FORMAT` | `text` | Log format: `text` or `json` (one JSON object per line) |
| `LOG_FILE` | — | Optional rotating log file, e.g. `/config/logs/importer.log` |
| `LOG_FILE_MAX_BYTES` | `1048576` | Rotation size |
| `LOG_FILE_BACKUPS` | `5` | Rotated copies kept |
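
To show how the logging variables fit together, here is a sketch that maps them onto Python's standard `logging` module. The variable names and defaults come from the table; the logger name, the `JsonFormatter` fields and `configure_logging` itself are hypothetical.

```python
import json
import logging
import os
from logging.handlers import RotatingFileHandler


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line (field names are an assumption)."""

    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        })


def configure_logging(env=os.environ):
    logger = logging.getLogger("importer-sketch")
    logger.handlers.clear()
    logger.setLevel(env.get("LOG_LEVEL", "INFO").upper())
    if env.get("LOG_FORMAT", "text") == "json":
        formatter = JsonFormatter()
    else:
        formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = [logging.StreamHandler()]
    if env.get("LOG_FILE"):
        # Rotate at LOG_FILE_MAX_BYTES, keeping LOG_FILE_BACKUPS old copies.
        handlers.append(RotatingFileHandler(
            env["LOG_FILE"],
            maxBytes=int(env.get("LOG_FILE_MAX_BYTES", "1048576")),
            backupCount=int(env.get("LOG_FILE_BACKUPS", "5")),
        ))
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```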

## Volumes

| Path | Contents |
|---|---|
| `/config` | `config.yaml`, `rules.yaml`, optional `logs/` |
| `/data/input` | drop bank CSVs here |
| `/data/output` | generated PocketLog CSVs + `*.unmatched.csv` |
| `/data/processed` | successfully processed originals, one subdirectory per run |
| `/data/failed` | files that failed to parse or import, one subdirectory per run |

## Supported banks

| Bank | File | Format |
|---|---|---|
| easybank | `EASYBANK_Umsatzliste_*.csv` | no header, 6 cols, `DD.MM.YYYY`, `-13,99` |
| dadat | `umsaetzegirokonto_*.csv` | header, 27 cols, `YYYY-MM-DD`, `-200,00` |

Adding a bank = a new parser in `pocketlog_importer/parsers/` (implement `sniff` +
`parse`) registered in `parsers/__init__.py`.
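
Much of a new parser's work is value normalisation. The sketch below handles the two date formats and the comma-decimal amounts from the table. The helper names, the `"out"` label for debits and the treatment of `.` as a thousands separator are assumptions; the actual `sniff`/`parse` signatures are defined in the repository.

```python
from datetime import date, datetime
from decimal import Decimal


def parse_amount(raw):
    """Turn '-13,99' into (Decimal('13.99'), 'out'): positive amount plus direction."""
    # Assumption: '.' is a thousands separator and ',' the decimal mark.
    value = Decimal(raw.strip().replace(".", "").replace(",", "."))
    return abs(value), ("in" if value >= 0 else "out")


def parse_date(raw, fmt):
    """Parse a booking date, e.g. fmt='%d.%m.%Y' (easybank) or '%Y-%m-%d' (dadat)."""
    return datetime.strptime(raw.strip(), fmt).date()
```

`Decimal` avoids the rounding surprises that binary floats would introduce for currency values.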

## Development

```sh
python -m venv .venv && . .venv/bin/activate
pip install -r requirements-dev.txt && pip install -e .
```

Lint and test commands (= CI) and the branching/release flow are in
[`CONTRIBUTING.md`](https://github.com/anym001/pocketlog-importer/blob/HEAD/CONTRIBUTING.md).

### Contract tests

`tests/integration/` runs the real pipeline against a real PocketLog container
and pins the import API contract (round-trip, dedup idempotency, per-row error
format, auth scopes). Requires Docker; excluded from the default `pytest -q`
run:

```sh
pytest -m integration # released image
POCKETLOG_IMAGE=ghcr.io/anym001/pocketlog:dev pytest -m integration
```

CI runs them on every PR against the released image, and nightly against
`:latest` + `:dev` (`contract.yml`) to catch contract drift from the PocketLog
side before it is released.

## License

Licensed under the GNU Affero General Public License v3.0 or later
(AGPL-3.0-or-later), the same license as the companion
[`pocketlog`](https://github.com/anym001/pocketlog) project. See
[`LICENSE`](https://github.com/anym001/pocketlog-importer/blob/HEAD/LICENSE) for the full text.

---

Built with [Claude Code](https://claude.com/claude-code).