Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/anasaber/rag_in_cpu
This is an Advanced RAG system built to run on a regular CPU-only PC using free resources, leveraging APIs and tools to make it happen.
- Host: GitHub
- URL: https://github.com/anasaber/rag_in_cpu
- Owner: AnasAber
- Created: 2024-07-16T21:38:28.000Z (4 months ago)
- Default Branch: master
- Last Pushed: 2024-10-08T17:45:22.000Z (29 days ago)
- Last Synced: 2024-10-11T04:04:17.726Z (27 days ago)
- Topics: advanced-rag, chromadb, cohere, cohere-api, generative-ai, groq, groq-api, langchain, llms, rag, rag-application, rag-implementation, rag-pipeline
- Language: Python
- Homepage:
- Size: 7.84 MB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Advanced Retrieval-Augmented Generation (RAG) System
This project implements an Advanced RAG system designed to work on a regular PC using free resources, leveraging various APIs and tools to achieve this.

## Models Used
- Embedding Model: Utilized `nomic-embed-text-v1` using HuggingFace API.
- Reranker: Utilized `rerank-english-v2.0` using Cohere API.
- Language Model (LLM): Leveraged the Groq API with `llama3-70b-8192`.

## System Overview
![System Architecture Diagram](src/images/RAG_in_CPU.gif)
The RAG system consists of the following components:
#### Chunking and Embedding:
Text data is chunked into manageable pieces.
Each chunk is embedded using a model from HuggingFace.
Embeddings are stored in a vector database (ChromaDB).
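As a rough illustration of the chunking step, here is a minimal fixed-size character chunker with overlap. The function name and default sizes are illustrative assumptions, not the repository's actual code (which may use a token-based splitter, e.g. from LangChain):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks.

    chunk_size and overlap are illustrative defaults; a real pipeline
    would likely split on tokens or sentence boundaries instead.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each resulting chunk would then be embedded (here, via the HuggingFace API) and stored in ChromaDB.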
#### Retrieval and Reranking:
Relevant chunks are retrieved from ChromaDB based on the query.
Retrieved chunks are reranked using the Cohere API so that the most relevant chunks are prioritized.
#### Response Generation:
The top-ranked chunks are passed to the Llama model (via the Groq API) to generate a coherent and relevant response.
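The three components above can be sketched as a simple composition. Everything here (the function names and the callable parameters) is a hypothetical illustration of the data flow, not the repository's code; in the real pipeline `retrieve` would query ChromaDB, `score` would come from Cohere's `rerank-english-v2.0`, and `generate` would call `llama3-70b-8192` via the Groq API:

```python
from typing import Callable

def rag_answer(
    query: str,
    retrieve: Callable[[str], list[str]],       # e.g. ChromaDB similarity search
    score: Callable[[str, str], float],         # e.g. Cohere rerank relevance score
    generate: Callable[[str, list[str]], str],  # e.g. Llama 3 via the Groq API
    top_k: int = 3,
) -> str:
    """Retrieve candidate chunks, rerank them, and generate a response."""
    candidates = retrieve(query)
    # Rerank: order candidates by relevance to the query, keep the best top_k.
    reranked = sorted(candidates, key=lambda c: score(query, c), reverse=True)
    context = reranked[:top_k]
    return generate(query, context)
```

With stub callables, `rag_answer("q", retrieve, score, generate, top_k=2)` returns whatever `generate` builds from the two highest-scoring chunks, which makes the orchestration easy to test independently of the external APIs.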
### How to start
1. Clone the repository
```bash
git clone https://github.com/AnasAber/RAG_in_CPU.git
```
2. Install the dependencies
```bash
pip install -r requirements.txt
```
3. Install the package via `setup.py`
```bash
python setup.py install
```
4. Set up the environment variables
```bash
export GROQ_API_KEY="your_groq_key"
export COHERE_API_KEY="your_cohere_key"
export HUGGINGFACE_API_KEY="your_huggingFace_key"
```
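On the Python side, these keys are typically read with `os.environ`. The variable names match the exports above, but this helper is a sketch of the usual pattern, not necessarily how the repository reads them:

```python
import os

def load_api_keys() -> dict[str, str]:
    """Read the three API keys set in step 4, failing fast if one is missing."""
    keys = {}
    for name in ("GROQ_API_KEY", "COHERE_API_KEY", "HUGGINGFACE_API_KEY"):
        value = os.environ.get(name)
        if not value:
            raise RuntimeError(f"Missing environment variable: {name}")
        keys[name] = value
    return keys
```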
5. Run the `app.py` file
```bash
python app.py
```
The reason I'm using a virtual environment is to avoid any conflicts with the dependencies (I had to manually change things in configuration files) and to make sure the project runs smoothly.
This project's RAG uses semantic search with ChromaDB. I'll work on combining Hybrid Search with HyDE, following the RAG best practices described in this paper: [link](https://arxiv.org/html/2407.01219v1#:~:text=A%20typical%20RAG%20workflow%20usually,based%20on%20their%20relevance%20to)
![System Architecture Diagram](src/images/x1.png)
If you encounter an error, hit me up, make a pull request, or report an issue, and I'll happily respond.
### Disadvantages
- The Cohere API is free and unlimited for testing, but paid for production use.

### Next goals
- See if there's a fast, high-quality alternative to the Cohere API
- Evaluate the performance of this RAG pipeline
- Implement a combination of Hybrid Search and HyDE
- Add Repacking after Reranking, and before giving the prompt back to the model
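For the planned Hybrid Search, one common way to combine a semantic (dense) result list with a keyword (e.g. BM25) result list is reciprocal rank fusion (RRF). This is a generic sketch of that technique under the assumption of simple ranked lists of document IDs, not code from this repository:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists (e.g. dense ChromaDB results and BM25
    keyword results) into a single ranking via reciprocal rank fusion.

    k=60 is the constant commonly used for RRF; each document scores
    1 / (k + rank) per list, summed across lists.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)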