https://github.com/prbento/personal_finance_ai

LLM-powered personal finance ERP via Telegram. Dual-agent AI pipeline, PostgreSQL AP/AR ledger, FastAPI webhook.

# 💰 Zotto - Finance AI Data App: LLM-Powered Personal ERP

[![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)](https://python.org)
[![PostgreSQL](https://img.shields.io/badge/postgresql-4169e1?style=for-the-badge&logo=postgresql&logoColor=white)](https://postgresql.org)
[![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white)](https://telegram.org)
[![Railway](https://img.shields.io/badge/Railway-131415?style=for-the-badge&logo=railway&logoColor=white)](https://railway.app)
[![Groq](https://img.shields.io/badge/Groq-f55036?style=for-the-badge&logo=groq&logoColor=white)](https://groq.com)
[![FastAPI](https://img.shields.io/badge/FastAPI-005571?style=for-the-badge&logo=fastapi)](https://fastapi.tiangolo.com)
[![Streamlit](https://img.shields.io/badge/Streamlit-FF4B4B?style=for-the-badge&logo=streamlit&logoColor=white)](https://streamlit.io)

*(For the Portuguese version, [click here](#-versão-em-português-brasileiro))*

## 👨‍💻 Author
**Bento** - GitHub: [@prBento](https://github.com/prBento)

---

## 🇺🇸 English Version

### 🎯 About the Project

**Zotto** is a full-stack data application that acts as a **personal financial ERP**. It uses Large Language Models to ingest unstructured daily inputs (free-text messages, electronic invoice URLs, and complex PDF bills) and transforms them into a strictly governed relational PostgreSQL database with full Accounts Payable/Receivable tracking, a real-time Cash Flow Statement, and a Streamlit BI dashboard for financial intelligence.

🤝 **AI Collaboration Note:** Product vision, business rules, and architectural decisions are mine. The code was developed through pair programming with **Gemini AI** (Google) and **Claude** (Anthropic).

---

### 🗺️ System Architecture: Message Flow

Every message follows a deterministic path from Telegram to the database. Here's how:

```text
┌────────────────────────────────────────────────────────────────┐
│                         TELEGRAM USER                          │
└──────────┬──────────────────┬──────────────────┬───────────────┘
           │ Text / URL       │ PDF document     │ /command
           ▼                  ▼                  ▼
┌────────────────────────────────────────────────────────────────┐
│          security_check decorator (ALLOWED_CHAT_IDS)           │
└──────────┬─────────────────────────────────────┬───────────────┘
           │ ingestion                           │ /contas /extrato /help
           ▼                                     ▼
┌───────────────────────┐             ┌──────────────────────┐
│ PostgreSQL            │             │ Command handlers     │
│ process_queue         │             │ (direct DB reads)    │
│ status=PENDING        │             └──────────────────────┘
└──────────┬────────────┘
           │ every 10s
           ▼
┌───────────────────────────────┐
│ queue_processor (worker)      │ ← rate limit? reschedule with backoff
└──────┬─────────────┬──────────┘
       │ URL         │ PDF / text
       ▼             ▼
┌──────────┐  ┌────────────┐
│BeautifulS│  │PyPDF text  │
│oup scrape│  │extraction  │
└──────┬───┘  └──────┬─────┘
       └──────┬──────┘
              ▼
┌─────────────────────────┐     ┌──────────────────────────┐
│ Agent 1: Extract        │────▶│ Agent 2: Enrich          │
│ temp=0.0                │     │ temp=0.1                 │
│ CoT date reasoning      │     │ disambiguation rules     │
└─────────────────────────┘     └──────────┬───────────────┘
                                           │
                                           ▼
                                ┌────────────────────────┐
                                │ Math validation        │
                                │ discount detector      │
                                │ duplicate check        │
                                └──────────┬─────────────┘
                                           │
                                           ▼
                            ┌───────────────────────────────┐
                            │ State Machine                 │
                            │ → ask method / location       │◀─── user replies
                            │ → ask card / first date       │
                            │ → show summary (Sim/Não)      │
                            └──────────────┬────────────────┘
                                           │ confirmed
                                           ▼
                            ┌───────────────────────────────┐
                            │ PostgreSQL                    │
                            │ transactions                  │◀──── Streamlit
                            │ transaction_items             │      dashboard.py
                            │ installments                  │      reads here
                            └───────────────────────────────┘
```
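
The first gate in the flow above is the `security_check` decorator. A minimal sketch of how such a whitelist decorator could look is given below. The body, the hard-coded chat id, and the `fake_update` helper are assumptions for illustration and not the project's actual code; only the `effective_chat` attribute mirrors the real `python-telegram-bot` `Update` object.

```python
import asyncio
from functools import wraps
from types import SimpleNamespace

# Hypothetical whitelist; the real bot presumably builds it from ALLOWED_CHAT_IDS.
ALLOWED_CHAT_IDS = {111111111}


def security_check(handler):
    """Run the wrapped async handler only for whitelisted chats."""
    @wraps(handler)
    async def wrapper(update, context):
        chat = getattr(update, "effective_chat", None)
        if chat is None or chat.id not in ALLOWED_CHAT_IDS:
            return None  # drop updates from unknown chats without replying
        return await handler(update, context)
    return wrapper


@security_check
async def handle_text(update, context):
    # Placeholder handler: the real one would enqueue the message for processing.
    return f"queued: {update.text}"


def fake_update(chat_id, text):
    # Stand-in for telegram.Update, carrying only the fields used in this sketch.
    return SimpleNamespace(effective_chat=SimpleNamespace(id=chat_id), text=text)
```

Calling `asyncio.run(handle_text(fake_update(111111111, "coffee 12.50"), None))` runs the handler, while any other chat id is silently ignored.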

**Deployment modes:**
- `prod` → FastAPI + Uvicorn → Telegram pushes updates to `POST /webhook`
- `dev` → `run_polling()` → the bot polls Telegram for updates every few seconds
- `dashboard` → Streamlit service on Railway, using the same PostgreSQL plugin

---

### 🌟 Key Features

- **Multimodal Ingestion:** Free text, NFC-e URLs, and PDF utility bills in one pipeline.
- **Dual-Agent AI:** Agent 1 extracts (`temp=0.0`); Agent 2 categorizes (`temp=0.1`). A disambiguation ruleset prevents common misclassifications (Total Pass → Academy, iFood → Food, streaming → Subscriptions, NF-e → always Expense).
- **Hidden Discount Detection:** If `sum(items) > invoice_total`, the difference is automatically registered as a discount.
- **Resilient Outbox Queue:** Exponential backoff (60s–3600s), TPD-aware (tokens per day) 90-minute deferral, `max_attempts` dead-item protection, and busy-state deferral that does not consume retry attempts.
- **AP/AR Dashboard (`/contas`):** Accordion credit card grouping, income vs. expense differentiation, smart anticipation logic, dynamic method override, overdue alerts, Fast-Forward, Isolated View.
- **Cash Flow Statement (`/extrato`):** Saldo Atual + Projetado (current and projected balance), Benefit Wallet isolation (VA/VR), dynamic installment index (`8/10`), `[B]` tag, `*` for pending items.
- **Streamlit BI Dashboard (`dashboard.py`):**
  - **Saúde do Mês** (Monthly Health): KPIs with savings rate, correct cash-basis values (`paid_amount` for PAID, `expected_amount` for PENDING), benefit wallet isolation.
  - **Tendências** (Trends): Monthly income/expense series, savings rate evolution, category trends (multi-select), card participation breakdown, accumulated discount savings.
  - **Cartões & Parcelas** (Cards & Installments): Income commitment gauge (adjustable horizon), debt curve (burn rate), active installment drill-down.
  - **Projeção de Caixa** (Cash Projection): Projected monthly and cumulative balance, tabular summary.
  - **Operacional - Itens** (Operational: Items): Hierarchical treemap (macro → category → subcategory), sunburst drill-down, top items and brands, frequency vs. ticket scatter, day-of-month heatmap, full audit table with triple filters.
- **Cash Basis Accounting:** When an installment is paid, its `month` is updated to the payment month, so `/extrato` and the Streamlit dashboard reflect when money actually moved.
- **Cloud-Native:** Two Railway services sharing one PostgreSQL plugin.
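
Three of the rules in this list are simple enough to sketch in a few lines. The functions below are illustrative only: the names, the doubling factor in the backoff, and the dictionary shape of an installment are assumptions, while the `sum(items) > invoice_total` test, the 60s to 3600s bounds, and the `paid_amount` / `expected_amount` split come from the list itself.

```python
from decimal import Decimal


def detect_hidden_discount(item_totals, invoice_total):
    """Return the discount implied when the item sum exceeds the invoice total."""
    items_sum = sum(item_totals, Decimal("0"))
    if items_sum > invoice_total:
        return items_sum - invoice_total
    return Decimal("0")


def next_retry_delay(attempt, base=60, cap=3600):
    """Exponential backoff in seconds: base * 2**(attempt - 1), capped at `cap`.

    The doubling factor is an assumption; only the 60s and 3600s bounds are documented.
    """
    return min(cap, base * 2 ** max(attempt - 1, 0))


def cash_basis_amount(installment):
    """Use paid_amount for PAID installments and expected_amount otherwise."""
    if installment["status"] == "PAID":
        return installment["paid_amount"]
    return installment["expected_amount"]
```

For example, items totalling 15.50 against an invoice total of 14.00 yield a discount of 1.50, and retry attempts 1, 2, and 7 wait 60, 120, and 3600 seconds respectively.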

---

### 🛠️ Tech Stack

| Layer | Technology | Version |
|-------|-----------|---------|
| Language | Python | 3.12 |
| Conversational Interface | `python-telegram-bot` | 20.8 |
| Web Server (prod) | FastAPI + Uvicorn | 0.135.3 / 0.44.0 |
| AI Engine | Groq API (`llama-4-scout-17b`) | - |
| Database | PostgreSQL (Docker / Railway) | 15 |
| DB Driver | `psycopg2-binary` | 2.9.11 |
| BI Dashboard | Streamlit + Plotly | - |
| Web Scraping | `BeautifulSoup4` | 4.14.3 |
| PDF Extraction | `PyPDF` | 6.9.1 |
| Date Arithmetic | `python-dateutil` | 2.9.0 |

---

### 🤖 Creating your Telegram Bot

1. Open Telegram → search `@BotFather` → `/newbot` → copy the **HTTP API Token**.
2. Send any message to your new bot, then talk to `@userinfobot` to find your personal `chat_id` for `ALLOWED_CHAT_IDS`.

---

### 🚀 How to Run Locally

**Prerequisites:** Python 3.12, Docker, Groq API key ([console.groq.com](https://console.groq.com)).

1. **Clone:** `git clone https://github.com/prBento/personal_finance_ai.git && cd personal_finance_ai`

2. **Create `.env`** (never commit):
```env
ENVIRONMENT=dev
TELEGRAM_TOKEN_DEV=your_dev_bot_token
TELEGRAM_TOKEN_PROD=your_prod_bot_token
GROQ_API_KEY_DEV=your_dev_groq_key
GROQ_API_KEY_PROD=your_prod_groq_key
DB_USER=your_db_user
DB_PASSWORD=your_db_password
DB_NAME=db_finance
DATABASE_URL=postgresql://${DB_USER}:${DB_PASSWORD}@localhost:5432/${DB_NAME}
ALLOWED_CHAT_IDS=your_telegram_chat_id
RAILWAY_DB_URL=postgresql://postgres:password@host:5432/railway
```

3. **Start DB:** `docker-compose up -d`

4. **Run bot:** `python -m venv venv && source venv/bin/activate && pip install -r requirements.txt && python bot.py`

5. **Run dashboard (separate terminal):** `streamlit run dashboard.py`

6. **Sync Production Data to Local (Optional):** To test the dashboard locally with real data, use the sync script. It reads its credentials from your `.env` file, so make sure `RAILWAY_DB_URL` is set there, then run in PowerShell:
```powershell
.\sync_db.ps1
```
*This script creates a disposable container that downloads production data and streams it directly into your local database in memory, creating no intermediate files and preserving UTF-8 encoding.*

7. **Test Telegram Mini App Locally (Optional):** Telegram Mini Apps require a secure HTTPS URL, so to test the `/dash` command locally you must expose your Streamlit port. Download **ngrok**, run `.\ngrok http 0000` (replace `0000` with your Streamlit port; Streamlit defaults to 8501), copy the generated `https://...ngrok-free.app` URL, and set it as `DASHBOARD_URL` in your `.env`.
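
The `.env` file in step 2 keeps separate `_DEV` and `_PROD` credentials, selected by `ENVIRONMENT`. The sketch below shows one way this selection could be done. The function name, the returned dictionary, and the comma-separated format for `ALLOWED_CHAT_IDS` are assumptions for illustration, not the project's actual loader.

```python
def load_settings(env):
    """Build runtime settings from a mapping shaped like os.environ."""
    mode = env.get("ENVIRONMENT", "dev").lower()
    suffix = "PROD" if mode == "prod" else "DEV"
    raw_ids = env.get("ALLOWED_CHAT_IDS", "")
    return {
        "mode": mode,
        "telegram_token": env[f"TELEGRAM_TOKEN_{suffix}"],
        "groq_api_key": env[f"GROQ_API_KEY_{suffix}"],
        "database_url": env["DATABASE_URL"],
        # Assumption: several chat ids may be listed, separated by commas.
        "allowed_chat_ids": {int(part) for part in raw_ids.split(",") if part.strip()},
    }


# Example values only; real secrets stay in the git-ignored .env file.
example_env = {
    "ENVIRONMENT": "prod",
    "TELEGRAM_TOKEN_DEV": "t-dev",
    "TELEGRAM_TOKEN_PROD": "t-prod",
    "GROQ_API_KEY_DEV": "g-dev",
    "GROQ_API_KEY_PROD": "g-prod",
    "DATABASE_URL": "postgresql://user:pass@localhost:5432/db_finance",
    "ALLOWED_CHAT_IDS": "111, 222",
}
settings = load_settings(example_env)
```

In a real run the mapping would typically be `os.environ` after the `.env` file has been loaded.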

---

### ☁️ Cloud Deployment (Railway)

The project runs as **two independent Railway services** sharing a single PostgreSQL plugin.

#### Service 1: Bot (FastAPI + Webhook)

1. Create a Railway project → add the **PostgreSQL** plugin.
2. Connect your GitHub repo. Railway detects `.python-version` (Python 3.12) and installs `requirements.txt` automatically.
3. In the service **Variables** tab, add:
   - `ENVIRONMENT=prod`
   - `TELEGRAM_TOKEN_PROD`, `GROQ_API_KEY_PROD`
   - `DATABASE_URL` (use Railway's **internal** URL from the PostgreSQL plugin)
   - `ALLOWED_CHAT_IDS`
4. Ensure the `Procfile` reads `web: python bot.py` (not `worker`) so Railway assigns a public URL and the `PORT` variable for the webhook server.
5. After deploy, register the webhook with Telegram:
```
https://api.telegram.org/bot<TELEGRAM_TOKEN_PROD>/setWebhook?url=https://<your-railway-domain>/webhook
```
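
The registration URL in step 5 can also be assembled in code, which avoids quoting mistakes in the `url` parameter. This is a small sketch: the function name and the example domain are hypothetical, and only the `setWebhook` method and the `/webhook` path come from the steps above.

```python
from urllib.parse import urlencode


def build_set_webhook_url(token, public_base_url):
    """Return the Bot API URL that registers <public_base_url>/webhook as the webhook."""
    webhook_url = public_base_url.rstrip("/") + "/webhook"
    query = urlencode({"url": webhook_url})  # percent-encodes the nested URL
    return f"https://api.telegram.org/bot{token}/setWebhook?{query}"


# Placeholder token and domain; substitute your own values.
registration_url = build_set_webhook_url("123:ABC", "https://example.up.railway.app/")
```

Opening the resulting URL in a browser, or requesting it with any HTTP client, performs the registration.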

#### Service 2: Dashboard (Streamlit)

1. In the **same Railway project**, click **+ New Service → GitHub Repo** and connect the same repository again (Railway allows multiple services per repo).
2. In the new service's **Settings → Start Command**, set:
```
streamlit run dashboard.py --server.port $PORT --server.address 0.0.0.0
```
3. In the service **Variables** tab, add only:
   - `DATABASE_URL` (same internal URL from the PostgreSQL plugin; both services share it)
4. Optionally set a custom domain or use the Railway-generated URL to access the dashboard.
5. The dashboard connects directly to the same PostgreSQL instance the bot writes to, so no extra configuration is needed.

---

### 🗂️ Project Structure

```text
personal_finance_ai/
├── bot.py               # Handlers, State Machine, queue worker, AI pipeline, FastAPI server
├── database.py          # All DB functions, connection pool, CTE queries, table creation
├── dashboard.py         # Streamlit BI dashboard (5 analytical tabs)
├── prompts.py           # AI Prompts (Extraction & Enrichment)
├── Procfile             # Railway bot service: "web: python bot.py"
├── docker-compose.yml   # Local PostgreSQL
├── requirements.txt     # Python dependencies (includes streamlit, plotly)
├── sync_db.ps1          # PowerShell script to sync production DB to local DB
├── .python-version      # Forces Python 3.12 on Railway Nixpacks
├── ARCHITECTURE.md      # Full technical specification
├── BACKLOG.md           # Product backlog and roadmap
└── .env                 # Secrets (git-ignored)
```

---

### 🚦 Conventional Commits

| Prefix | Use for |
|--------|---------|
| `feat:` | New feature |
| `fix:` | Bug fix |
| `refactor:` | No behavior change |
| `docs:` | Documentation |
| `chore:` | Build or config |

---

### 🗺️ Development Roadmap

#### ✅ V1: Production Foundation
Core ingestion, Outbox + Backoff, NFC-e + PDF, installment engine, connection pool, whitelist, DATE columns.

#### ✅ V2: Accounting Engine & UX
- Accordion AP/AR dashboard with group invoice payment.
- `/extrato` with cash-basis accounting, benefit wallet, installment index (`8/10`).
- Dynamic payment method override at settlement time.
- Credit card anticipation (moves installment to next invoice cycle, stays PENDING).
- FastAPI webhook architecture. Interactive `/help` menu.
- Hidden discount detector. Disambiguation ruleset.

#### ✅ V3: Scale & Visualization
- Streamlit BI dashboard on Railway (second service, shared PostgreSQL).
- 5-tab analytical dashboard: Saúde do Mês, Tendências, Cartões & Parcelas, Projeção de Caixa, Operacional.
- Correct cash-basis KPIs (`paid_amount` for PAID), savings rate metric.
- Benefit wallet isolation in Streamlit (same logic as `/extrato`).
- Hierarchical item analysis: treemap, sunburst, frequency × ticket scatter, day heatmap.
- Accumulated discount/anticipation savings curve.
- Income commitment gauge with adjustable horizon slider.
- Blacklist filter for locations (starts empty, select items to exclude).
- Prompts extracted to `prompts.py`.

#### 🚧 V4: Hardening & Intelligence
- [ ] Replace `print()` with `logging` module for structured log levels.
- [ ] Multi-transaction support per LLM response.
- [ ] PDF password decryption mid-conversation.
- [ ] Replace `psycopg2` with `asyncpg` (non-blocking DB calls in FastAPI event loop).
- [ ] Budget targets per category (stored in DB, configurable via dashboard).

---
---

## 🇧🇷 Versão em Português Brasileiro

### 🎯 Sobre o Projeto

**Zotto** é uma Aplicação de Dados Full-Stack que atua como um **ERP financeiro pessoal**. Usa LLMs para ingerir inputs não estruturados do dia a dia, como mensagens de texto livre, URLs de notas fiscais (NFC-e) e PDFs complexos de contas, e os transforma em um banco de dados PostgreSQL rigidamente governado. O projeto rastreia Contas a Pagar/Receber, gera um Extrato de Fluxo de Caixa em tempo real e fornece um Dashboard BI no Streamlit para inteligência financeira.

🤝 **Colaboração IA:** Decisões de produto, regras de negócio e arquitetura por mim. Código desenvolvido em pair-programming com **Gemini AI** (Google) e **Claude** (Anthropic).

---

### 🗺️ Arquitetura do Sistema: Fluxo de Mensagens

Toda mensagem segue um caminho determinístico do Telegram até o banco de dados. Veja como funciona:

```text
┌────────────────────────────────────────────────────────────────┐
│                        USUÁRIO TELEGRAM                        │
└──────────┬──────────────────┬──────────────────┬───────────────┘
           │ Texto / URL      │ Documento PDF    │ /comandos
           ▼                  ▼                  ▼
┌────────────────────────────────────────────────────────────────┐
│          Decorator security_check (ALLOWED_CHAT_IDS)           │
└──────────┬─────────────────────────────────────┬───────────────┘
           │ Ingestão                            │ /contas /extrato /help
           ▼                                     ▼
┌───────────────────────┐             ┌──────────────────────┐
│ PostgreSQL            │             │ Handlers de comando  │
│ process_queue         │             │ (leitura direta BD)  │
│ status=PENDING        │             └──────────────────────┘
└──────────┬────────────┘
           │ a cada 10s
           ▼
┌───────────────────────────────┐
│ queue_processor (worker)      │ ← rate limit? reagenda com backoff
└──────┬─────────────┬──────────┘
       │ URL         │ PDF / texto
       ▼             ▼
┌──────────┐  ┌────────────┐
│Scrape via│  │Extração de │
│BeautifulS│  │texto PyPDF │
└──────┬───┘  └──────┬─────┘
       └──────┬──────┘
              ▼
┌─────────────────────────┐     ┌──────────────────────────┐
│ Agente 1: Extração      │────▶│ Agente 2: Enriquec.      │
│ temp=0.0                │     │ temp=0.1                 │
│ CoT datas               │     │ regras desambiguação     │
└─────────────────────────┘     └──────────┬───────────────┘
                                           │
                                           ▼
                                ┌────────────────────────┐
                                │ Validação matemática   │
                                │ detector de desconto   │
                                │ verificação duplicata  │
                                └──────────┬─────────────┘
                                           │
                                           ▼
                            ┌───────────────────────────────┐
                            │ Máquina de Estados            │
                            │ → pede método / local         │◀─── usuário responde
                            │ → pede cartão / 1ª data       │
                            │ → mostra resumo (Sim/Não)     │
                            └──────────────┬────────────────┘
                                           │ confirmado
                                           ▼
                            ┌───────────────────────────────┐
                            │ PostgreSQL                    │
                            │ transactions                  │◀──── Streamlit
                            │ transaction_items             │      dashboard.py
                            │ installments                  │      lê aqui
                            └───────────────────────────────┘
```

**Modos de Deploy:**
- `prod` → FastAPI + Uvicorn → Telegram envia via `POST /webhook`
- `dev` → `run_polling()` → bot pesquisa ativamente no Telegram
- `dashboard` → Serviço Streamlit no Railway, usando o mesmo plugin PostgreSQL

---

### 🌟 Funcionalidades Principais

- **Ingestão Multimodal:** Texto livre, URLs de NFC-e e faturas em PDF em um único pipeline.
- **IA de Duplo Agente:** Agente 1 extrai dados (`temp=0.0`); Agente 2 categoriza (`temp=0.1`). Regras de desambiguação evitam erros comuns (ex: Total Pass → Academia, iFood → Alimentação, NF-e → sempre Despesa).
- **Detecção de Desconto Oculto:** Se a `soma(itens) > total_nota`, a diferença é automaticamente registrada como desconto aplicado.
- **Fila Outbox Resiliente:** Backoff Exponencial (60s–3600s), adiamento de 90 min para limite TPD, proteção contra limite de tentativas (`max_attempts`), pausa de fila sem consumir tentativas se o usuário estiver respondendo.
- **Dashboard AP/AR (`/contas`):** Agrupamento por cartão em acordeon, diferenciação de receitas/despesas, lógica de antecipação, mudança de método de pagamento na hora da baixa, alertas de vencimento, avanço rápido e Visão Isolada.
- **Extrato Financeiro (`/extrato`):** Saldo Atual vs Projetado, isolamento da Carteira de Benefícios (VA/VR), índice dinâmico de parcelas (`8/10`), tag `[B]`, e `*` para lançamentos previstos.
- **Dashboard BI Streamlit (`dashboard.py`):**
  - **Saúde do Mês**: KPIs com taxa de poupança, valores corretos em regime de caixa (`paid_amount` para PAGO, `expected_amount` para PENDENTE), isolamento de benefícios.
  - **Tendências**: Série mensal de receitas/despesas, evolução da poupança, tendências por categoria (multi-select), participação por cartão e economia acumulada.
  - **Cartões & Parcelas**: Gauge de comprometimento de renda (horizonte ajustável), curva de dívida (burn rate) e detalhamento de parcelamentos ativos.
  - **Projeção de Caixa**: Saldo projetado mensal + acumulado e resumo em tabela.
  - **Operacional (Itens)**: Treemap hierárquico (macro → categoria → subcategoria), sunburst, top itens e marcas, scatter de frequência vs ticket, heatmap por dia do mês e tabela de auditoria com filtros triplos.
- **Contabilidade em Regime de Caixa:** O campo `month` da parcela se ajusta ao mês de pagamento real. O `/extrato` e o Streamlit refletem a movimentação exata do dinheiro.
- **Cloud-Native:** Dois serviços no Railway compartilhando um único plugin PostgreSQL.

---

### 🛠️ Stack Tecnológico

| Camada | Tecnologia | Versão |
|--------|-----------|--------|
| Linguagem | Python | 3.12 |
| Interface Bot | `python-telegram-bot` | 20.8 |
| Servidor Web (prod) | FastAPI + Uvicorn | 0.135.3 / 0.44.0 |
| Motor IA | Groq API (`llama-4-scout-17b`) | - |
| Banco de Dados | PostgreSQL (Docker / Railway) | 15 |
| Driver BD | `psycopg2-binary` | 2.9.11 |
| Dashboard BI | Streamlit + Plotly | - |
| Web Scraping | `BeautifulSoup4` | 4.14.3 |
| Leitura PDF | `PyPDF` | 6.9.1 |
| Aritmética de Datas | `python-dateutil` | 2.9.0 |

---

### 🤖 Criando seu Bot no Telegram

1. Abra o Telegram → busque por `@BotFather` → digite `/newbot` → copie o **Token da API HTTP**.
2. Envie qualquer mensagem para o seu novo bot, e em seguida converse com o `@userinfobot` para descobrir o seu `chat_id` pessoal. Insira esse número no seu `ALLOWED_CHAT_IDS`.

---

### 🚀 Como Rodar Localmente

**Pré-requisitos:** Python 3.12, Docker, Chave de API do Groq ([console.groq.com](https://console.groq.com)).

1. **Clonar:** `git clone https://github.com/prBento/personal_finance_ai.git && cd personal_finance_ai`

2. **Criar `.env`** (nunca faça commit):
```env
ENVIRONMENT=dev
TELEGRAM_TOKEN_DEV=seu_token_dev
TELEGRAM_TOKEN_PROD=seu_token_prod
GROQ_API_KEY_DEV=sua_chave_groq_dev
GROQ_API_KEY_PROD=sua_chave_groq_prod
DB_USER=seu_usuario_bd
DB_PASSWORD=sua_senha_bd
DB_NAME=db_finance
DATABASE_URL=postgresql://${DB_USER}:${DB_PASSWORD}@localhost:5432/${DB_NAME}
ALLOWED_CHAT_IDS=seu_chat_id_telegram
RAILWAY_DB_URL=postgresql://postgres:senha@host:5432/railway
```

3. **Subir BD Local:** `docker-compose up -d`

4. **Rodar bot:** `python -m venv venv && source venv/bin/activate && pip install -r requirements.txt && python bot.py`

5. **Rodar dashboard (em outro terminal):** `streamlit run dashboard.py`

6. **Sincronizar Banco de Produção (Opcional):** Para testar o painel localmente com dados reais, utilize o script de sincronização. Ele consome as credenciais do seu arquivo `.env` de forma segura. Certifique-se de ter adicionado a variável `RAILWAY_DB_URL` no seu `.env` e rode no PowerShell:
```powershell
.\sync_db.ps1
```
*O script gera um container descartável que baixa os dados de produção e os injeta diretamente no seu banco local em memória, sem gerar arquivos e preservando o formato UTF-8.*

7. **Testar Web App Localmente (Opcional):** Para testar o comando `/dash`, os Mini Apps do Telegram exigem uma URL HTTPS segura. Baixe o **ngrok**, exponha a porta do Streamlit rodando `.\ngrok http 0000` (substitua `0000` pela porta do seu Streamlit; o padrão é 8501), copie a URL gerada e adicione como `DASHBOARD_URL` no seu `.env`.

---

### ☁️ Deploy na Nuvem (Railway)

O projeto roda como **dois serviços independentes** no mesmo projeto Railway, compartilhando um único plugin PostgreSQL.

#### Serviço 1: Bot (FastAPI + Webhook)

1. Crie um projeto no Railway → adicione o plugin **PostgreSQL**.
2. Conecte o repositório GitHub. O Railway detecta o `.python-version` (Python 3.12) e instala o `requirements.txt` automaticamente.
3. Na aba **Variables** do serviço, adicione:
   - `ENVIRONMENT=prod`
   - `TELEGRAM_TOKEN_PROD`, `GROQ_API_KEY_PROD`
   - `DATABASE_URL` (URL **interna** do plugin PostgreSQL do Railway)
   - `ALLOWED_CHAT_IDS`
4. Garanta que o arquivo `Procfile` contenha `web: python bot.py` para o Railway provisionar a URL pública e a variável `PORT` para o servidor webhook.
5. Após o deploy, registre o webhook abrindo este link no seu navegador:
```
https://api.telegram.org/bot<TELEGRAM_TOKEN_PROD>/setWebhook?url=https://<seu-dominio-railway>/webhook
```

#### Serviço 2: Dashboard (Streamlit)

1. No **mesmo projeto Railway**, clique em **+ New Service → GitHub Repo** e conecte o mesmo repositório novamente (o Railway permite múltiplos serviços para o mesmo repositório).
2. Na aba **Settings → Start Command** do novo serviço, defina:
```
streamlit run dashboard.py --server.port $PORT --server.address 0.0.0.0
```
3. Na aba **Variables** deste serviço, adicione apenas:
   - `DATABASE_URL` (a mesma URL interna do plugin PostgreSQL; os dois serviços a compartilham)
4. Defina um domínio customizado ou use a URL gerada pelo Railway para acessar o dashboard.
5. O dashboard se conecta diretamente à mesma instância PostgreSQL onde o bot escreve os dados.

---

### 🗂️ Estrutura do Projeto

```text
personal_finance_ai/
├── bot.py               # Handlers, Máquina de Estados, worker de fila, IA e servidor FastAPI
├── database.py          # Funções de BD, connection pool, queries complexas e criação de tabelas
├── dashboard.py         # Dashboard BI no Streamlit (5 abas analíticas)
├── prompts.py           # Prompts da IA (Extração e Enriquecimento)
├── Procfile             # Serviço bot do Railway: "web: python bot.py"
├── docker-compose.yml   # Banco PostgreSQL local
├── requirements.txt     # Dependências (inclui streamlit, plotly, fastapi)
├── sync_db.ps1          # Script PowerShell para clonar a base de prod para o ambiente local
├── .python-version      # Força o Python 3.12 no Nixpacks do Railway
├── ARCHITECTURE.md      # Especificação técnica completa do projeto
├── BACKLOG.md           # Backlog de produto e roadmap
└── .env                 # Variáveis secretas (ignorado pelo git)
```

---

### 🚦 Commits Convencionais

| Prefixo | Uso |
|---------|-----|
| `feat:` | Nova funcionalidade |
| `fix:` | Correção de bug |
| `refactor:` | Mudança sem impacto visual/funcional |
| `docs:` | Documentação |
| `chore:` | Build, pacotes ou configuração |

---

### 🗺️ Roadmap de Desenvolvimento

#### ✅ V1: Fundação de Produção
Ingestão central, Outbox + Backoff, NFC-e + PDF, motor de parcelamento, connection pool, whitelist de segurança, colunas em formato DATE.

#### ✅ V2: Motor Contábil e UX
- Dashboard AP/AR com menu acordeon e pagamento em massa de faturas.
- `/extrato` rodando 100% em regime de caixa, com carteira benefício isolada e índice de parcelas (`8/10`).
- Sobrescrita de método de pagamento no momento da baixa.
- Antecipação de cartão de crédito (move a parcela para o fechamento da fatura seguinte, mas mantém PENDENTE).
- Arquitetura FastAPI webhook. Menu `/help` interativo.
- Detecção algorítmica de desconto oculto. Regras de desambiguação de IA.

#### ✅ V3: Escala e Visualização
- Dashboard Streamlit BI no Railway (segundo serviço, mesmo PostgreSQL).
- 5 abas analíticas: Saúde do Mês, Tendências, Cartões & Parcelas, Projeção de Caixa, Operacional.
- KPIs em regime de caixa absoluto (`paid_amount` vs `expected_amount`) e métrica de taxa de poupança.
- Isolamento da carteira de benefício no Streamlit (mesma lógica do `/extrato`).
- Análise de itens (regime de competência): treemap hierárquico, sunburst, frequência vs ticket, heatmap de dias.
- Gráfico de curva de descontos acumulados e antecipações.
- Gauge de comprometimento de renda com slider de horizonte futuro.
- Filtro de locais em formato Blacklist (inicia vazio, ocultando apenas os itens selecionados).
- Refatoração dos prompts para o arquivo central `prompts.py`.

#### 🚧 V4: Hardening e Inteligência
- [ ] Substituir `print()` pelo módulo `logging` com níveis estruturados.
- [ ] Suporte a multi-transação (várias compras na mesma resposta do LLM).
- [ ] Quebra de senha de PDFs de operadoras de celular durante a conversa.
- [ ] Substituir `psycopg2` por `asyncpg` (chamadas não-bloqueantes no event loop do FastAPI).
- [ ] Metas de orçamento por categoria (armazenadas no banco e geridas pelo painel).