{"id":13456008,"url":"https://github.com/danny-avila/rag_api","last_synced_at":"2025-05-15T10:07:49.373Z","repository":{"id":228689533,"uuid":"773345074","full_name":"danny-avila/rag_api","owner":"danny-avila","description":"ID-based RAG FastAPI: Integration with Langchain and PostgreSQL/pgvector","archived":false,"fork":false,"pushed_at":"2025-05-08T18:02:54.000Z","size":29610,"stargazers_count":500,"open_issues_count":39,"forks_count":200,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-05-11T00:32:22.187Z","etag":null,"topics":["api","api-rest","embeddings","fastapi","langchain","pgvector","postgresql","psql","python","rag","vector","vector-database"],"latest_commit_sha":null,"homepage":"https://librechat.ai/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/danny-avila.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-03-17T12:20:35.000Z","updated_at":"2025-05-10T17:30:48.000Z","dependencies_parsed_at":"2024-04-09T21:37:50.742Z","dependency_job_id":"8857c33e-1b12-4c2e-b6ce-177671e26777","html_url":"https://github.com/danny-avila/rag_api","commit_stats":{"total_commits":57,"total_committers":15,"mean_commits":3.8,"dds":"0.38596491228070173","last_synced_commit":"36ce865fd5e338200c3d8a7decb238c36f9e479f"},"previous_names":["danny-avila/rag_api"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danny-avila%2Frag_api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dan
ny-avila%2Frag_api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danny-avila%2Frag_api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danny-avila%2Frag_api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/danny-avila","download_url":"https://codeload.github.com/danny-avila/rag_api/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254303469,"owners_count":22048207,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","api-rest","embeddings","fastapi","langchain","pgvector","postgresql","psql","python","rag","vector","vector-database"],"created_at":"2024-07-31T08:01:14.750Z","updated_at":"2025-05-15T10:07:44.357Z","avatar_url":"https://github.com/danny-avila.png","language":"Python","readme":"﻿# ID-based RAG FastAPI\r\n\r\n## Overview\r\nThis project integrates Langchain with FastAPI in an Asynchronous, Scalable manner, providing a framework for document indexing and retrieval, using PostgreSQL/pgvector.\r\n\r\nFiles are organized into embeddings by `file_id`. The primary use case is for integration with [LibreChat](https://librechat.ai), but this simple API can be used for any ID-based use case.\r\n\r\nThe main reason to use the ID approach is to work with embeddings on a file-level. 
This makes for targeted queries when combined with file metadata stored in a database, such as is done by LibreChat.\r\n\r\nThe API will evolve over time to employ different querying/re-ranking methods, embedding models, and vector stores.\r\n\r\n## Features\r\n- **Document Management**: Methods for adding, retrieving, and deleting documents.\r\n- **Vector Store**: Utilizes Langchain's vector store for efficient document retrieval.\r\n- **Asynchronous Support**: Offers async operations for enhanced performance.\r\n\r\n## Setup\r\n\r\n### Getting Started\r\n\r\n- **Configure `.env` file based on [section below](#environment-variables)**\r\n- **Set up pgvector database:**\r\n  - Run an existing PSQL/pgvector setup, or:\r\n  - Docker: `docker compose up` (also starts RAG API)\r\n    - or, use docker just for DB: `docker compose -f ./db-compose.yaml up`\r\n- **Run API**:\r\n  - Docker: `docker compose up` (also starts PSQL/pgvector)\r\n    - or, use docker just for RAG API: `docker compose -f ./api-compose.yaml up`\r\n  - Local:\r\n    - Make sure to set `DB_HOST` to the correct database hostname\r\n    - Run the following commands (preferably in a [virtual environment](https://realpython.com/python-virtual-environments-a-primer/))\r\n```bash\r\npip install -r requirements.txt\r\nuvicorn main:app\r\n```\r\n\r\n### Environment Variables\r\n\r\nThe application is configured with the following environment variables:\r\n\r\n- `RAG_OPENAI_API_KEY`: The API key for OpenAI API Embeddings (if using default settings).\r\n    - Note: `OPENAI_API_KEY` will work, but `RAG_OPENAI_API_KEY` will override it so as not to conflict with the LibreChat setting.\r\n- `RAG_OPENAI_BASEURL`: (Optional) The base URL for your OpenAI API Embeddings\r\n- `RAG_OPENAI_PROXY`: (Optional) Proxy for OpenAI API Embeddings\r\n- `VECTOR_DB_TYPE`: (Optional) Selects the vector database type; defaults to `pgvector`.\r\n- `POSTGRES_DB`: (Optional) The name of the PostgreSQL database, used when 
`VECTOR_DB_TYPE=pgvector`.\r\n- `POSTGRES_USER`: (Optional) The username for connecting to the PostgreSQL database.\r\n- `POSTGRES_PASSWORD`: (Optional) The password for connecting to the PostgreSQL database.\r\n- `DB_HOST`: (Optional) The hostname or IP address of the PostgreSQL database server.\r\n- `DB_PORT`: (Optional) The port number of the PostgreSQL database server.\r\n- `RAG_HOST`: (Optional) The hostname or IP address where the API server will run. Defaults to \"0.0.0.0\"\r\n- `RAG_PORT`: (Optional) The port number where the API server will run. Defaults to port 8000.\r\n- `JWT_SECRET`: (Optional) The secret key used for verifying JWT tokens for requests.\r\n  - The secret is only used for verification. This basic approach assumes a signed JWT from elsewhere.\r\n  - Omit to run API without requiring authentication\r\n\r\n- `COLLECTION_NAME`: (Optional) The name of the collection in the vector store. Default value is \"testcollection\".\r\n- `CHUNK_SIZE`: (Optional) The size of the chunks for text processing. Default value is \"1500\".\r\n- `CHUNK_OVERLAP`: (Optional) The overlap between chunks during text processing. Default value is \"100\".\r\n- `RAG_UPLOAD_DIR`: (Optional) The directory where uploaded files are stored. Default value is \"./uploads/\".\r\n- `PDF_EXTRACT_IMAGES`: (Optional) A boolean value indicating whether to extract images from PDF files. 
Default value is \"False\".\r\n- `DEBUG_RAG_API`: (Optional) Set to \"True\" to show more verbose logging output in the server console, and to enable postgresql database routes\r\n- `CONSOLE_JSON`: (Optional) Set to \"True\" to log as json for Cloud Logging aggregations\r\n- `EMBEDDINGS_PROVIDER`: (Optional) either \"openai\", \"bedrock\", \"azure\", \"huggingface\", \"huggingfacetei\" or \"ollama\", where \"huggingface\" uses sentence_transformers; defaults to \"openai\"\r\n- `EMBEDDINGS_MODEL`: (Optional) Set a valid embeddings model to use from the configured provider.\r\n    - **Defaults**\r\n    - openai: \"text-embedding-3-small\"\r\n    - azure: \"text-embedding-3-small\" (will be used as your Azure Deployment)\r\n    - huggingface: \"sentence-transformers/all-MiniLM-L6-v2\"\r\n    - huggingfacetei: \"http://huggingfacetei:3000\". Hugging Face TEI uses model defined on TEI service launch.\r\n    - ollama: \"nomic-embed-text\"\r\n    - bedrock: \"amazon.titan-embed-text-v1\"\r\n- `RAG_AZURE_OPENAI_API_VERSION`: (Optional) Default is `2023-05-15`. 
The version of the Azure OpenAI API.\r\n- `RAG_AZURE_OPENAI_API_KEY`: (Optional) The API key for Azure OpenAI service.\r\n    - Note: `AZURE_OPENAI_API_KEY` will work but `RAG_AZURE_OPENAI_API_KEY` will override it in order to not conflict with LibreChat setting.\r\n- `RAG_AZURE_OPENAI_ENDPOINT`: (Optional) The endpoint URL for Azure OpenAI service, including the resource.\r\n    - Example: `https://YOUR_RESOURCE_NAME.openai.azure.com`.\r\n    - Note: `AZURE_OPENAI_ENDPOINT` will work but `RAG_AZURE_OPENAI_ENDPOINT` will override it in order to not conflict with LibreChat setting.\r\n- `HF_TOKEN`: (Optional) if needed for `huggingface` option.\r\n- `OLLAMA_BASE_URL`: (Optional) defaults to `http://ollama:11434`.\r\n- `ATLAS_SEARCH_INDEX`: (Optional) the name of the vector search index if using Atlas MongoDB, defaults to `vector_index`\r\n- `MONGO_VECTOR_COLLECTION`: Deprecated for MongoDB, please use `ATLAS_SEARCH_INDEX` and `COLLECTION_NAME`\r\n- `AWS_DEFAULT_REGION`: (Optional) defaults to `us-east-1`\r\n- `AWS_ACCESS_KEY_ID`: (Optional) needed for bedrock embeddings\r\n- `AWS_SECRET_ACCESS_KEY`: (Optional) needed for bedrock embeddings\r\n\r\nMake sure to set these environment variables before running the application. You can set them in a `.env` file or as system environment variables.\r\n\r\n### Use Atlas MongoDB as Vector Database\r\n\r\nInstead of using the default pgvector, we could use [Atlas MongoDB](https://www.mongodb.com/products/platform/atlas-vector-search) as the vector database. To do so, set the following environment variables\r\n\r\n```env\r\nVECTOR_DB_TYPE=atlas-mongo\r\nATLAS_MONGO_DB_URI=\u003cmongodb+srv://...\u003e\r\nCOLLECTION_NAME=\u003cvector collection\u003e\r\nATLAS_SEARCH_INDEX=\u003cvector search index\u003e\r\n```\r\n\r\nThe `ATLAS_MONGO_DB_URI` could be the same or different from what is used by LibreChat. 
Even if it is the same, the `$COLLECTION_NAME` collection needs to be a completely new one, separate from all collections used by LibreChat. In addition, create a vector search index for the collection above (remember to assign its name to `$ATLAS_SEARCH_INDEX`) with the following JSON:\r\n\r\n```json\r\n{\r\n  \"fields\": [\r\n    {\r\n      \"numDimensions\": 1536,\r\n      \"path\": \"embedding\",\r\n      \"similarity\": \"cosine\",\r\n      \"type\": \"vector\"\r\n    },\r\n    {\r\n      \"path\": \"file_id\",\r\n      \"type\": \"filter\"\r\n    }\r\n  ]\r\n}\r\n```\r\n\r\nFollow one of the [four documented methods](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-index/#procedure) to create the vector index.\r\n\r\n\r\n### Cloud Installation Settings:\r\n\r\n#### AWS:\r\nMake sure your RDS Postgres instance adheres to this requirement:\r\n\r\n`The pgvector extension version 0.5.0 is available on database instances in Amazon RDS running PostgreSQL 15.4-R2 and higher, 14.9-R2 and higher, 13.12-R2 and higher, and 12.16-R2 and higher in all applicable AWS Regions, including the AWS GovCloud (US) Regions.`\r\n\r\nTo set up RDS Postgres with the RAG API, follow these steps:\r\n\r\n* Create an RDS Instance/Cluster using the provided [AWS Documentation](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_CreateDBInstance.html).\r\n* Log in to the RDS Cluster using the Endpoint connection string from the RDS Console or from your IaC Solution output.\r\n* The login is via the *Master User*.\r\n* Create a dedicated database for rag_api:\r\n```create database rag_api;```\r\n* Create a dedicated user/role for that database:\r\n```create role rag;```\r\n\r\n* Switch to the database you just created: ```\\c rag_api```\r\n* Enable the Vector extension: ```create extension vector;```\r\n* Use the documentation provided above to set up the connection string to the RDS Postgres Instance/Cluster.\r\n\r\nNotes:\r\n  * Even though you log in with the 
Master user, it does not have full superuser privileges, which is why we cannot use the command: ```create role x with superuser;```\r\n  * If you do not enable the extension, the rag_api service will throw an error that it cannot create the extension, for the reason noted above.\r\n\r\n### Dev notes:\r\n\r\n#### Installing pre-commit formatter\r\n\r\nRun the following commands to install the pre-commit formatter, which uses the [black](https://github.com/psf/black) code formatter:\r\n\r\n```bash\r\npip install pre-commit\r\npre-commit install\r\n```\r\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanny-avila%2Frag_api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdanny-avila%2Frag_api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanny-avila%2Frag_api/lists"}