https://github.com/devroopsaha744/agatha
Agatha is an NL2SQL-driven expense tracker API that leverages Llama 3.3, served by Groq, for fast data extraction and its NL2SQL workflow. It allows users to either add transaction data to a SQL database or ask questions about their expenses, all via natural language.
- Host: GitHub
- URL: https://github.com/devroopsaha744/agatha
- Owner: devroopsaha744
- License: mit
- Created: 2025-02-12T14:21:35.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-02-16T11:45:30.000Z (8 months ago)
- Last Synced: 2025-06-19T04:03:50.762Z (4 months ago)
- Topics: docker, expense-tracker, fastapi, generative-ai, groq-api, langchain, llama3-3, llm, nlp, python, text2sql, transaction-management
- Language: Jupyter Notebook
- Homepage:
- Size: 577 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Agatha: Natural Language Based Expense Tracker API
Agatha is an NL2SQL-driven expense tracker API that uses Llama 3.3, served by Groq, for fast data extraction and its NL2SQL workflow. It allows users to either add transaction data to a SQL database or ask questions about their expenses, all via natural language. Agatha automatically determines whether an input is intended for ingestion (adding a transaction) or QnA (querying transactions), so users never have to specify the intent themselves.

---
## Table of Contents
- [Overview](#overview)
- [Features](#features)
- [API Workflow](#api-workflow)
- [Privacy & Security Considerations](#privacy--security-considerations)
- [Endpoints](#endpoints)
- [Health Check](#health-check)
- [Database Status](#database-status)
- [Classify Input](#classify-input)
- [Ingest Transaction](#ingest-transaction)
- [Ask a Question](#ask-a-question)
- [Process Query](#process-query)
- [Installation](#installation)
- [Local Setup](#local-setup)
- [Using Docker](#using-docker)
- [Example Usage with Postman](#example-usage-with-postman)
- [Dependencies](#dependencies)
- [License](#license)
- [Summary](#summary)

---
## Overview
Agatha uses three core pipelines to handle user inputs:
1. **Ingestion Pipeline:**
   Extracts transaction details (date, amount, vendor, description) from natural language input and saves them into an SQLite database.
2. **QnA Pipeline:**
   Converts natural language questions into SQL queries (via a text-to-SQL chain), executes them against the transactions database, and returns answers based on the retrieved data.
3. **Classification Pipeline:**
   Automatically classifies a natural language query as either "ingestion" or "chat" and routes it to the appropriate pipeline, so the user never has to specify the intent explicitly.

---
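The classify-then-route flow above can be sketched in a few lines. This is a minimal stand-in, not the repository's code: where Agatha calls Llama 3.3 via Groq to classify the input, the sketch uses a trivial keyword heuristic, and both handlers are hypothetical placeholders.

```python
# Minimal sketch of Agatha's classify-then-route flow.
# NOTE: the real project classifies with Llama 3.3 via Groq; the keyword
# heuristic and handler stubs below are illustrative placeholders only.

def classify(text: str) -> str:
    """Stand-in for the LLM classification pipeline."""
    question_words = ("what", "how", "when", "which", "did i")
    if text.lower().rstrip("?").startswith(question_words) or text.endswith("?"):
        return "chat"
    return "ingestion"

def handle_ingestion(text: str) -> dict:
    # Placeholder for the extraction + SQLite-insert pipeline.
    return {"input_type": "ingestion", "raw": text}

def handle_chat(text: str) -> dict:
    # Placeholder for the text-to-SQL QnA pipeline.
    return {"input_type": "chat", "question": text}

def process(text: str) -> dict:
    """Route the input to whichever pipeline the classifier picks."""
    route = {"ingestion": handle_ingestion, "chat": handle_chat}
    return route[classify(text)](text)
```

The point of the dispatch table is that adding a new pipeline only means adding one entry, which mirrors how the `/process` endpoint described later hides the routing from the caller.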
## Features
- **Natural Language Ingestion:**
  Easily add expenses by describing them in plain language.
- **Natural Language Querying:**
  Ask questions about your spending (e.g., "What is my total spending?") and get answers generated by converting your question into a SQL query.
- **Automatic Query Classification:**
  The API distinguishes between ingestion and query requests automatically to provide a seamless experience.
- **SQLite Database Integration:**
  All transactions are stored in an SQLite database that is created and maintained automatically.

---
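The automatically created SQLite layer can be reproduced with nothing but the standard library. The schema below is an assumption inferred from the transaction fields shown in this README (date, amount, vendor, description); the actual table definition lives in the repository and may differ.

```python
import sqlite3

# Hypothetical schema inferred from the transaction fields in this README;
# the repository's real table definition may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS transactions (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    date        TEXT    NOT NULL,   -- ISO-8601, e.g. '2024-01-12'
    amount      REAL    NOT NULL,
    vendor      TEXT    NOT NULL,
    description TEXT
)
"""

def get_db(path: str = "transactions.db") -> sqlite3.Connection:
    """Open the transactions database, creating the table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn
```

Passing `":memory:"` as the path gives a throwaway database, which is handy for experimenting without touching `transactions.db`.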
## API Workflow
1. **User Input:**
   A natural language input is sent to the API.
2. **Classification:**
   The input is first classified as either ingestion or QnA by the classification pipeline, based on the semantics of the input.
3. **Routing:**
   - If the input is for ingestion, Agatha extracts the transaction details (using an extraction prompt) and saves them into the SQLite database.
   - If the input is a question, Agatha converts it to a SQL query (using a text-to-SQL chain), executes the query against the database, and rephrases the answer before returning it.
4. **Response:**
   Agatha returns a JSON response containing either the ingested transaction details or the answer to the query.

---
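The execute-and-answer half of the QnA step can be sketched as follows. The SQL string here stands in for what the text-to-SQL chain would generate from "What is my total spending?"; the schema, `run_query`, and the answer phrasing are illustrative assumptions, not the repository's code.

```python
import sqlite3

def run_query(conn: sqlite3.Connection, sql: str) -> list:
    """Execute a (model-generated) read-only query and fetch all rows."""
    return conn.execute(sql).fetchall()

# Demo against an in-memory database seeded with one transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (date TEXT, amount REAL, vendor TEXT, description TEXT)")
conn.execute("INSERT INTO transactions VALUES ('2024-01-12', 50.0, 'Walmart', 'groceries')")

# In Agatha this SQL would come from the text-to-SQL chain, not be hand-written.
sql = "SELECT SUM(amount) FROM transactions"
total = run_query(conn, sql)[0][0]
answer = f"Your total spending is ${total:.2f}."
```

In the real pipeline the raw result (`50.0`) is rephrased by the LLM rather than by a fixed format string, but the execution step is the same.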
## Privacy & Security Considerations
- **Local Data Storage:**
  Agatha is designed as a self-hosted solution. When deployed, all sensitive transaction data is stored locally in an SQLite database, ensuring that your personal financial information remains private and under your control.
- **User-Controlled Deployment:**
  Since you deploy Agatha on your own server or machine (via Docker or a local setup), you are fully responsible for securing your environment. No data is sent to external servers by default.
- **Secure Handling of API Keys:**
  API keys and other sensitive environment variables should be managed securely. We recommend using environment variables, `.env` files (which should not be committed to source control), or Docker secrets for production deployments.
- **Best Practices for Public Deployment:**
  If you choose to deploy Agatha publicly, implement proper authentication, use HTTPS to secure communications, and follow best practices for securing your database (such as using parameterized queries to prevent SQL injection).
- **Data Privacy:**
  Ensure that only authorized users have access to Agatha, and consider additional layers of encryption or access control if handling highly sensitive financial data.

---
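Parameterized queries, mentioned above, are the standard defense against SQL injection with SQLite. A minimal illustration using Python's built-in `sqlite3` module (the table and fields are assumptions based on the transaction shape in this README):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (date TEXT, amount REAL, vendor TEXT, description TEXT)")
conn.execute("INSERT INTO transactions VALUES ('2024-01-12', 50.0, 'Walmart', 'groceries')")

# UNSAFE: interpolating user input into SQL allows injection; a value like
# "x' OR '1'='1" would match every row:
#   conn.execute(f"SELECT * FROM transactions WHERE vendor = '{vendor}'")

# SAFE: a `?` placeholder makes the driver treat the input purely as data.
vendor = "x' OR '1'='1"
rows = conn.execute(
    "SELECT * FROM transactions WHERE vendor = ?", (vendor,)
).fetchall()
# The malicious string matches nothing instead of dumping the whole table.
```

The same placeholder style applies to the ingestion insert, so extracted values from the LLM never get spliced into SQL text directly.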
## Endpoints
### Health Check
- **URL:** `/health`
- **Method:** `GET`
- **Description:** Returns the health status of the API.
- **Example Response:**
```json
{
"status": "healthy",
"version": "1.0.0"
}
```

### Database Status
- **URL:** `/db_status`
- **Method:** `GET`
- **Description:** Provides the current status of the SQLite database, including the names of tables and the table count.
- **Example Response:**
```json
{
"database": "transactions.db",
"tables": ["transactions"],
"table_count": 1
}
```

### Classify Input
- **URL:** `/classify`
- **Method:** `POST`
- **Description:** Classifies a given natural language input as either "ingestion" or "chat".
- **Request Body:**
```json
{
"text": "I spent 500 INR at Amazon on electronics."
}
```
- **Example Response:**
```json
{
"input_type": "ingestion"
}
```

### Ingest Transaction
- **URL:** `/ingest`
- **Method:** `POST`
- **Description:** Extracts transaction details from natural language input and stores the transaction in the database.
- **Request Body:**
```json
{
"text": "I bought groceries from Walmart for $50 on January 12th."
}
```
- **Example Response:**
```json
{
"status": "success",
"transaction": {
"date": "2024-01-12",
"amount": 50.0,
"vendor": "Walmart",
"description": "groceries"
}
}
```

### Ask a Question
- **URL:** `/ask`
- **Method:** `POST`
- **Description:** Converts a natural language question into a SQL query, executes it, and returns the answer.
- **Request Body:**
```json
{
"question": "What is my total spending?"
}
```
- **Example Response:**
```json
{
"question": "What is my total spending?",
"answer": "Your total spending is $500."
}
```

### Process Query
- **URL:** `/process`
- **Method:** `POST`
- **Description:** Automatically classifies the natural language input as either ingestion or QnA and routes it to the appropriate pipeline.
- **Request Body:**
```json
{
"text": "How much did I spend at Starbucks last month?"
}
```
- **Example Response (if classified as chat):**
```json
{
"input_type": "chat",
"question": "How much did I spend at Starbucks last month?",
"answer": "$30 at Starbucks last month."
}
```
- **Example Response (if classified as ingestion):**
```json
{
"input_type": "ingestion",
"transaction": {
"date": "2024-02-15",
"amount": 30.0,
"vendor": "Starbucks",
"description": "coffee and snacks"
}
}
```

---
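With the server running locally (see the installation steps that follow), the endpoints above can also be exercised from Python using only the standard library. The base URL assumes the default `uvicorn` host and port shown in this README; `call` is a hypothetical helper, not part of the project.

```python
import json
import urllib.request
from typing import Optional

BASE = "http://localhost:8000"  # default host/port from this README

def call(path: str, payload: Optional[dict] = None) -> dict:
    """POST `payload` as JSON (or GET when payload is None) and decode the reply."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        BASE + path, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example usage (requires the API to be running):
#   call("/health")
#   call("/process", {"text": "I spent 500 INR at Amazon on electronics."})
```

`urllib.request` sends a POST automatically whenever `data` is set, which is why the helper needs no explicit method argument.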
## Installation
### Local Setup
1. **Clone the Repository:**
```bash
git clone https://github.com/devroopsaha744/agatha.git
cd agatha
```

2. **Create a Virtual Environment:**
```bash
python3 -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
```

3. **Install Dependencies:**
```bash
pip install -r requirements.txt
```

4. **Environment Variables:**
Create a `.env` file in the project root and set your environment variables:
```dotenv
GROQ_API_KEY=your_groq_api_key
```

5. **Run the API:**
```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

6. **Access the API:**
Open your browser or API client at [http://localhost:8000](http://localhost:8000).

### Using Docker
1. **Build the Docker Image:**
```bash
docker build -t agatha .
```

2. **Run the Docker Container:**
```bash
docker run -p 8000:8000 agatha
```

3. **Access the API:**
The API will be available at [http://localhost:8000](http://localhost:8000).

*Note:* For secure deployment, ensure you pass environment variables (or secrets) to your container. For example:
```bash
docker run -p 8000:8000 --env-file .env agatha
```

---
## Example Usage with Postman
1. **Start the API** (locally or via Docker).
2. **Open Postman** and create a new request.
3. **Set Request Type & URL:**
- For example, to test the health check:
- **Method:** GET
- **URL:** `http://localhost:8000/health`
4. **For POST Endpoints:**
- Select the **Body** tab, choose **raw**, and set the type to **JSON**.
- Input the JSON payload as described in the endpoint documentation above.
5. **Send Request:**
- Click **Send** and verify the JSON response matches the expected output.

---
## Dependencies
The project requires the following Python packages (as listed in `requirements.txt`):
```txt
fastapi
uvicorn
pydantic
python-dotenv
langchain
langchain-groq
langchain-community
langchain-core
```

*Note:* Version numbers for the `langchain`-related packages are omitted here; pin them to versions compatible with your setup.
---
## License
This project is licensed under the [MIT License](LICENSE).
---
## Summary
Agatha is a fully automated, natural language-based expense tracker API. It seamlessly handles both transaction ingestion and querying through intelligent classification, data extraction, and a text-to-SQL pipeline. Whether you're adding your expenses or querying your spending habits, Agatha streamlines the process to provide quick, accurate responses while keeping your sensitive financial data private.
**Privacy Reminder:** Agatha is intended for self-hosted use. When deployed on your own secure server or local machine, you remain in control of your data. For additional security, always use proper authentication, HTTPS, and environment management practices when handling sensitive information.
For any questions or issues, please open an issue on the repository.