{"id":25497464,"url":"https://github.com/perpetue237/rag-api-template","last_synced_at":"2026-03-07T11:03:09.469Z","repository":{"id":247905546,"uuid":"819268511","full_name":"Perpetue237/rag-api-template","owner":"Perpetue237","description":"A template to set up, build and deploy a RAG-API including frontend and backend with docker compose.","archived":false,"fork":false,"pushed_at":"2025-07-20T19:26:40.000Z","size":3043,"stargazers_count":8,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-01-28T07:43:22.938Z","etag":null,"topics":["backend","deployment","docker","docker-compose","frontend","gpu-acceleration","langchain","llm","python","rag"],"latest_commit_sha":null,"homepage":"https://www.youtube.com/@DebuggingwithKTiPs","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Perpetue237.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-24T07:11:16.000Z","updated_at":"2025-07-20T19:26:44.000Z","dependencies_parsed_at":"2024-07-17T22:55:58.983Z","dependency_job_id":"57645d65-0f3d-4326-a59a-a587eeee117a","html_url":"https://github.com/Perpetue237/rag-api-template","commit_stats":null,"previous_names":["perpetue237/rag-api-template"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Perpetue237/rag-api-template","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Perpetue237%2Frag-api-template","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Perpetue237%2Frag-api-tem
plate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Perpetue237%2Frag-api-template/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Perpetue237%2Frag-api-template/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Perpetue237","download_url":"https://codeload.github.com/Perpetue237/rag-api-template/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Perpetue237%2Frag-api-template/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30212103,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-07T09:02:10.694Z","status":"ssl_error","status_checked_at":"2026-03-07T09:02:08.429Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["backend","deployment","docker","docker-compose","frontend","gpu-acceleration","langchain","llm","python","rag"],"created_at":"2025-02-19T01:20:04.804Z","updated_at":"2026-03-07T11:03:09.453Z","avatar_url":"https://github.com/Perpetue237.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Retrieval-Augmented Generation (RAG) API Template with Frontend and Backend\nThis repository provides a template for setting up and building a GPU-accelerated RAG API with FastAPI and HuggingFace Transformers.\n\n- [Retrieval-Augmented Generation 
(RAG) API Template with Frontend and Backend](#retrieval-augmented-generation-rag-api-template-with-frontend-and-backend)\n  - [Project Structure](#project-structure)\n  - [Features](#features)\n  - [Requirements](#requirements)\n  - [Dependencies (tested on UBUNTU 22.04)](#dependencies-tested-on-ubuntu-2204)\n  - [Installation](#installation)\n    - [1. Clone the repo:](#1-clone-the-repo)\n    - [2. Navigate to the project directory:](#2-navigate-to-the-project-directory)\n    - [3. Sart the API:](#3-sart-the-api)\n    - [4. Visit the API](#4-visit-the-api)\n    - [5. Stop the Application and Clean the System](#5-stop-the-application-and-clean-the-system)\n  - [Deployment](#deployment)\n    - [Services with Docker Compose](#services-with-docker-compose)\n    - [Nginx Configuration Overview](#nginx-configuration-overview)\n  - [Contributions](#contributions)\n  - [License](#license)\n  - [Contact](#contact)\n\n## Project Structure\n\n```plaintext\n|-- LICENSE\n|-- README.md\n|-- backend\n|   |-- Dockerfile\n|   |-- app\n|   |-- requirements.txt\n|-- docker-compose.yml\n|-- frontend\n|   |-- Dockerfile\n|   |-- README.md\n|   |-- index.html\n|   |-- package-lock.json\n|   |-- package.json\n|   |-- public\n|   |-- src\n|   |-- tsconfig.app.json\n|   |-- tsconfig.json\n|   |-- tsconfig.node.json\n|   `-- vite.config.ts\n|-- install-docker.sh\n|-- install-nvidia-container-toolkit.sh\n|-- models_loader.ipynb\n`-- nginx\n    `-- nginx.conf\n```\n## Features\n\n- **File Upload:** Easily upload PDF files to the server.\n- **Document Loading:** Load and process documents using the langchain PyPDFLoader.\n- **Text Splitting:** Split documents into manageable chunks using the CharacterTextSplitter.\n- **Vector Store:** Create a FAISS vector store from document chunks for efficient retrieval.\n- **Embeddings:** Use HuggingFace embeddings to transform text data.\n- **Text Generation:** Generate answers to questions using a pre-trained language model.\n- **Asynchronous 
Streaming:** Stream responses asynchronously for efficient and responsive querying.\n- **Customizable Pipeline:** Easily customize the text generation pipeline with quantization and other settings.\n- **CORS Support:** Full CORS support for cross-origin requests.\n- **Logging:** Detailed logging for monitoring and debugging.\n- **Memory Management:** Efficient GPU memory management with garbage collection.\n\n## Requirements\n\n- Operating System: Windows, macOS, Linux\n- Minimum Disk Space: 10GB\n- Minimum Memory: 4GB RAM\n- GPU Characteristics: NVIDIA RTX A2000 GPU with 4096MiB total memory\n\n## Dependencies (tested on UBUNTU 22.04) \n\n- NVIDIA drivers\n- Visual Studio Code (optional, for development)\n- Jupyter Notebook (for downloading pre-trained models)\n\n## Installation\n\n### 1. Clone the repo:\n```sh\ngit clone https://github.com/Perpetue237/rag-api-template.git\n```\n\nAfter cloning the repository, follow these steps to set up the project:\n\n1. Install Docker and the NVIDIA Container Toolkit\n\n\u003e **Note:** Running these shell scripts reboots the computer. To avoid this, comment out the corresponding lines in the scripts.\n\nNavigate to the project directory:\n```sh\ncd rag-api-template\n```\n\n- Docker Desktop:\n    ```sh\n    echo \"deb [arch=\"$(dpkg --print-architecture)\" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu lunar stable\" | sudo tee /etc/apt/sources.list.d/docker.list \u003e /dev/null\n    sudo bash install-docker.sh\n    ```\n    Reboot the system after a successful installation.\n\n- NVIDIA Container Toolkit:\n    ```sh\n    sudo bash install-nvidia-container-toolkit.sh\n    ```\n\n\n2. Create Directories\nCreate directories to store the models, tokenizers, and data. You can create these directories anywhere on your file system. 
Here is an example that creates them under `~/rag-template` in your home directory:\n\n```sh\nmkdir -p ~/rag-template/models/models\nmkdir -p ~/rag-template/models/tokenizers\nmkdir -p ~/rag-template/rag-uploads\n```\n\n3. Update `docker-compose.yml`\nModify the [docker-compose.yml](docker-compose.yml) file to mount these directories:\n\n```yaml\nservices:\n    backend:\n        ...\n        volumes:\n        - /home/perpetue/rag-template/models/models:/app/models  # Mount the models directory\n        - /home/perpetue/rag-template/models/tokenizers:/app/tokenizers  # Mount the tokenizers directory\n        - /home/perpetue/rag-template/rag-uploads:/app/rag-uploads  # Mount the uploads directory\n        ...\n```\nReplace `/home/perpetue/rag-template` with the path where you created the directories.\n\n4. Update `.devcontainer/devcontainer.json`\nIf you are using VS Code for development, you need to mount these paths in the [.devcontainer/devcontainer.json](.devcontainer/devcontainer.json) file:\n\n```json\n    ...\n    \"mounts\": [\n            \"source=/home/perpetue/rag-template/models/models,target=/app/models,type=bind,consistency=cached\",\n            \"source=/home/perpetue/rag-template/models/tokenizers,target=/app/tokenizers,type=bind,consistency=cached\",\n            \"source=/home/perpetue/rag-template/rag-uploads,target=/app/rag-uploads,type=bind,consistency=cached\"\n        ],\n    ...\n```\nReplace `/home/perpetue/rag-template` with the path where you created the directories.\n\n5. Download Pre-trained Models\nUse the [models_loader.ipynb](models_loader.ipynb) notebook to download the pre-trained models you want to use. Open the notebook in Jupyter Notebook or JupyterLab and follow the instructions to download the necessary models. You can put your Hugging Face token and OpenAI keys in a `.env` file in the project root folder, following the sample in [.env.sample](.env.sample).\n\n### 2. 
Navigate to the project directory:\n```sh\ncd rag-api-template\n```\n\n### 3. Sart the API:\n```sh\ndocker compose down\ndocker volume prune\ndocker system prune\ndocker-compose up --build -d\n```\n\n### 4. Visit the API\nOnce the API is successfully built, you can visit it at [http://localhost/](http://localhost/). You should see the following frontend:\n\n\u003cdiv style=\"display: flex; justify-content: space-between; align-items: center;\"\u003e\n    \u003cimg src=\"homepage.png\" alt=\"Homepage\" style=\"height: 200px; width: 300px; margin-right: 5px; object-fit: cover;\"\u003e\n    \u003cimg src=\"screenwithpromt.png\" alt=\"Prompt\" style=\"height: 200px; width: 300px; object-fit: cover;\"\u003e\n\u003c/div\u003e\n\n### 5. Stop the Application and Clean the System\n\n```sh\ndocker-compose down\ndocker system prune\ndocker volume prune\n```\n\n## Deployment\n\nThis [Docker Compose file](docker-compose.yml) sets up a multi-container application with three services: `frontend`, `backend`, and `nginx`.\n\n### Services with Docker Compose\n\n- **Frontend Service**\n  - **Build Context**: `./frontend`\n  - **Port**: 80\n  - **Network**: `app-network`\n\n- **Backend Service**\n  - **Build Context**: `./backend`\n  - **Port**: 8000\n  - **Network**: `app-network`\n  - **Volumes**: Mounts local directories for models and uploads\n  - **GPU Configuration**: Configured to use NVIDIA GPUs\n\n- **Nginx Service**\n  - **Image**: `nginx:latest`\n  - **Port**: 8081\n  - **Configuration**: Uses a custom Nginx configuration file\n  - **Dependencies**: Depends on `frontend` and `backend`\n  - **Network**: `app-network`\n\n\nDocker Compose builds, configures, and runs the specified services in isolated containers. 
Services can communicate over the defined `app-network`, ensuring connectivity and proper resource allocation.\n\n### Nginx Configuration Overview\n\nThis [Nginx configuration file](nginx/nginx.conf) sets up a basic web server with proxying capabilities and custom error handling.\n\n- **Worker Processes**\n  - `worker_processes 1;` - Uses one worker process.\n\n- **Events**\n  - `worker_connections 1024;` - Allows up to 1024 connections per worker.\n\n- **HTTP Server**\n  - **Port**\n    - `listen 80;` - Listens on port 80 for HTTP requests.\n\n  - **Server Name**\n    - `server_name localhost;` - Uses `localhost` as the server name.\n\n  - **Root Path (`/`)**\n    - Serves static files from `/usr/share/nginx/html`.\n    - Defaults to `index.html` and falls back to it for single-page applications.\n\n  - **Proxy Endpoints**\n    - **/upload**\n      - Forwards requests to `http://backend:8000/upload`.\n      - Includes CORS headers for cross-origin requests.\n    - **/retrieve_from_path**\n      - Forwards requests to `http://backend:8000/retrieve_from_path`.\n      - Includes CORS headers.\n\n  - **Error Handling**\n    - `error_page 500 502 503 504 /50x.html;` - Custom error page for server errors.\n\n## Contributions\nContributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.\n\n1. Fork the Project\n2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)\n3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)\n4. Push to the Branch (`git push origin feature/AmazingFeature`)\n5. Open a Pull Request\n\n## License\n\nDistributed under the Apache License. 
See [LICENSE](LICENSE) for more information.\n\n## Contact\n\n[Perpetue Kuete Tiayo](https://www.linkedin.com/in/perpetue-k-375306185)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fperpetue237%2Frag-api-template","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fperpetue237%2Frag-api-template","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fperpetue237%2Frag-api-template/lists"}