https://github.com/guduchango/api-rag-chat
A complete, cloud-native API that uses a Retrieval-Augmented Generation (RAG) architecture to answer user questions based on a product catalog. It's built with a modern Python stack and deployed entirely on Google Cloud Platform using Infrastructure as Code.
docker embeddings github-actions google-cloud langchain postgresql python3 rag terraform vertex-ai workflow
- Host: GitHub
- URL: https://github.com/guduchango/api-rag-chat
- Owner: guduchango
- Created: 2025-08-22T01:53:12.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-08-25T21:07:25.000Z (7 months ago)
- Last Synced: 2025-08-25T22:31:14.413Z (7 months ago)
- Topics: docker, embeddings, github-actions, google-cloud, langchain, postgresql, python3, rag, terraform, vertex-ai, workflow
- Language: Python
- Homepage:
- Size: 152 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 13
Metadata Files:
- Readme: readme.md
README
# Conversational RAG API for Product Catalogs
This project is a complete, cloud-native API that uses a **Retrieval-Augmented Generation (RAG)** architecture to answer user questions based on a product catalog. It's built with a modern Python stack and deployed entirely on Google Cloud Platform using Infrastructure as Code.
This API can:
- Ingest product data from a CSV file.
- Perform semantic searches to find relevant products.
- Maintain conversational memory for follow-up questions.
- Differentiate between product searches and simple chitchat.
## ✨ Features
- **FastAPI Backend:** A robust and modern API framework.
- **Modern Dependency Management:** Project dependencies and virtual environments are managed with **Poetry**.
- **Automated Code Quality:** Code formatting and linting are enforced automatically before each commit using **pre-commit** with `black` and `ruff`.
- **Cloud-Native Stack:** Runs on Google Cloud Run and connects to a managed Cloud SQL database.
- **Vector Database:** Uses PostgreSQL with the `pgvector` extension for efficient similarity searches.
- **Cloud AI Models:** Leverages Google's Vertex AI for state-of-the-art text embeddings.
- **Conversational Memory:** Remembers the last 3 turns of a conversation for each user session.
- **Infrastructure as Code (IaC):** The entire cloud infrastructure is defined and managed with **Terraform**.
- **Containerized Development:** A full local development environment is orchestrated with **Docker Compose**.
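The conversational-memory feature listed above keeps only the last 3 turns per session. A minimal sketch of that behavior, assuming an in-process store keyed by `session_id`; the class name `SessionMemory` and its methods are hypothetical and not taken from the repository, which may persist history differently:

```python
from collections import defaultdict, deque

MAX_TURNS = 3  # the README states that the last 3 turns are remembered


class SessionMemory:
    """Hypothetical in-memory store: one bounded history per session."""

    def __init__(self, max_turns: int = MAX_TURNS) -> None:
        # deque(maxlen=n) silently drops the oldest item once n is exceeded.
        self._turns = defaultdict(lambda: deque(maxlen=max_turns))

    def add_turn(self, session_id: str, question: str, answer: str) -> None:
        self._turns[session_id].append((question, answer))

    def history(self, session_id: str) -> list[tuple[str, str]]:
        # Oldest turn first; unknown sessions yield an empty history.
        return list(self._turns.get(session_id, ()))
```

A bounded `deque` keeps the trimming logic out of the request handler: appending a fourth turn evicts the first without any explicit bookkeeping.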
## 🏛️ Architecture
The application follows a decoupled, service-based architecture:
1. **Orchestrator (FastAPI on Cloud Run):** A serverless container that hosts the Python application logic. It handles user requests, manages conversation state, and orchestrates calls to other services.
2. **Vector Store (Cloud SQL for PostgreSQL):** A managed database instance where product information and its corresponding vector embeddings are stored using the `pgvector` extension.
3. **Embedding Service (Vertex AI):** The `textembedding-gecko@003` model is called via its API to convert product descriptions and user questions into vector embeddings.
4. **Secure Configuration:** Database passwords and other secrets are managed securely using **Google Secret Manager**.
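To illustrate what the vector store contributes, here is a minimal sketch of cosine-similarity ranking in plain Python. It is a stand-in only: in the project, embeddings come from Vertex AI and the search runs inside PostgreSQL. The function names `cosine_similarity` and `top_k` and the toy 3-dimensional vectors are assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], catalog: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k product names whose embeddings are closest to the query."""
    ranked = sorted(
        catalog,
        key=lambda name: cosine_similarity(query, catalog[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings; real ones have hundreds of dimensions.
catalog = {
    "dog shampoo": [0.9, 0.1, 0.0],
    "cat food": [0.1, 0.9, 0.0],
    "garden hose": [0.0, 0.1, 0.9],
}
print(top_k([0.8, 0.2, 0.0], catalog, k=2))  # ['dog shampoo', 'cat food']
```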
## 🚀 Getting Started
### Prerequisites
- [Docker & Docker Compose](https://www.docker.com/products/docker-desktop/)
- [Poetry](https://python-poetry.org/docs/#installation)
- Python >=3.12
- [Terraform CLI](https://developer.hashicorp.com/terraform/tutorials/gcp-get-started/install-cli)
- [Google Cloud SDK (gcloud)](https://cloud.google.com/sdk/docs/install)
- A Google Cloud Project with billing enabled.
### Local Setup
#### Method 1: Running with Docker (Recommended)
This is the easiest way to get started, as it handles all dependencies within a container.
1. **Clone the repository:**
```bash
git clone https://github.com/guduchango/api-rag-chat
cd api-rag-chat
```
2. **Configure GCP Credentials:**
- Create a Service Account key as a JSON file and save it in the project root as `gcp-credentials.json`.
- **Important:** Make sure `gcp-credentials.json` and `.env` are listed in your `.gitignore` file to avoid committing secrets.
3. **Create `.env` file:**
Copy the `env.example` file to a new file named `.env` and fill in the values for your local and GCP setup.
4. **Run the application:**
Use Docker Compose to build and run the entire local stack (API + Database). The Compose setup uses `Dockerfile.dev`, which automatically installs dependencies with Poetry.
```bash
docker-compose up --build
```
The API will be available at `http://127.0.0.1:8000`, with interactive documentation at `http://127.0.0.1:8000/docs`.
#### Method 2: Running Locally with Poetry (Without Docker)
Use this method if you prefer to run the Python application directly on your machine.
1. **Complete Steps 1-3** from the Docker method above (Clone, Credentials, `.env`).
2. **Install Dependencies:**
Navigate to the project root and let Poetry install the required dependencies and create a virtual environment.
```bash
poetry install
```
3. **Activate the Virtual Environment:**
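Run your commands inside the virtual environment managed by Poetry. On Poetry 2.x, `poetry shell` is no longer built in and requires the `poetry-plugin-shell` plugin; alternatively, prefix each command with `poetry run`.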
```bash
poetry shell
```
4. **Install pre-commit hooks:**
Activate the automated code quality checks for your local repository.
```bash
pre-commit install
```
5. **Run the API:**
Start the Uvicorn server.
```bash
uvicorn src.main:app --reload
```
### ☁️ Deployment to Google Cloud
The deployment process uses the production-optimized `Dockerfile`.
1. **Configure Terraform:**
- Create a file named `terraform.tfvars` and add your GCP project and billing information.
2. **Deploy Infrastructure:**
```bash
terraform init
terraform apply
```
3. **Build, Push, and Deploy the Application:**
- Authenticate Docker with GCP:
```bash
gcloud auth configure-docker us-central1-docker.pkg.dev
```
- Build, push, and deploy the image. Replace `YOUR_PROJECT_ID` with your project ID, and adjust the repository and service names if yours differ.
```bash
# Build
docker build -t us-central1-docker.pkg.dev/YOUR_PROJECT_ID/api-rag-repo/rag-api:v1 .
# Push
docker push us-central1-docker.pkg.dev/YOUR_PROJECT_ID/api-rag-repo/rag-api:v1
# Deploy
gcloud run deploy api-rag-service \
--image=us-central1-docker.pkg.dev/YOUR_PROJECT_ID/api-rag-repo/rag-api:v1 \
--region=us-central1
```
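The build, push, and deploy commands above all reuse the same Artifact Registry image reference. A small sketch of how that reference is composed; the helper name `image_uri` is hypothetical, and its defaults simply mirror the values used in the commands above:

```python
def image_uri(
    project_id: str,
    region: str = "us-central1",
    repository: str = "api-rag-repo",
    image: str = "rag-api",
    tag: str = "v1",
) -> str:
    """Compose <region>-docker.pkg.dev/<project>/<repository>/<image>:<tag>."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"
```

Generating the reference in one place avoids a mismatch between the tag that is pushed and the tag that Cloud Run is told to deploy.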
## API Usage
The API provides interactive documentation via Swagger UI. Once the service is running (locally or in the cloud), navigate to `/docs`.
### Example: Upload CSV
```bash
curl -X POST "http://127.0.0.1:8000/api/upload-csv" \
-H "accept: application/json" \
-H "Content-Type: multipart/form-data" \
-F "file=@./path/to/your/flipkart_ecommerce_sample.csv;type=text/csv"
```
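The same upload can be scripted from Python. The sketch below uses only the standard library and assumes the endpoint expects a multipart form field named `file`, as the `curl` command above suggests; the helper names `build_multipart` and `upload_csv` are hypothetical.

```python
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, content: bytes,
                    content_type: str = "text/csv") -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def upload_csv(csv_path: str, base_url: str = "http://127.0.0.1:8000") -> bytes:
    """POST the CSV to /api/upload-csv and return the raw response body."""
    path = Path(csv_path)
    body, header = build_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/api/upload-csv",
        data=body,
        headers={"Content-Type": header, "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the local stack running, `upload_csv("./path/to/your/flipkart_ecommerce_sample.csv")` sends the same request as the `curl` example.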
### Example: Generate Prompt
```bash
curl -X POST "http://127.0.0.1:8000/api/generate-prompt?k=2" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
"session_id": "user@example.com",
"question": "shampoo for dogs"
}'
```
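A Python equivalent of the request above, again as a standard-library sketch. The function names are hypothetical, the response is assumed to be a JSON object whose schema is not documented here, and `k` presumably controls how many products are retrieved.

```python
import json
import urllib.parse
import urllib.request


def build_prompt_request(session_id: str, question: str, k: int = 2,
                         base_url: str = "http://127.0.0.1:8000") -> urllib.request.Request:
    """Build the POST request for /api/generate-prompt without sending it."""
    query = urllib.parse.urlencode({"k": k})
    payload = json.dumps({"session_id": session_id, "question": question}).encode()
    return urllib.request.Request(
        f"{base_url}/api/generate-prompt?{query}",
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def generate_prompt(session_id: str, question: str, k: int = 2) -> dict:
    """Send the request and decode the JSON response (assumed to be an object)."""
    request = build_prompt_request(session_id, question, k)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Reusing the same `session_id` across calls is what lets the API apply its conversational memory to follow-up questions.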