https://github.com/rupeshtr78/chroma-db-rag
Chroma DB vector database, with embedding and reranker models to implement a Retrieval Augmented Generation (RAG) system.
- Host: GitHub
- URL: https://github.com/rupeshtr78/chroma-db-rag
- Owner: rupeshtr78
- License: bsd-3-clause
- Created: 2024-07-22T00:06:20.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-29T20:51:57.000Z (5 months ago)
- Last Synced: 2025-04-29T21:37:11.999Z (5 months ago)
- Topics: chromadb, embeddings, golang, huggingface-models, huggingface-transformers, ollama, rag, re-ranking, reranking, text-embeddings-inference, vector-database
- Language: Go
- Homepage:
- Size: 13.8 MB
- Stars: 5
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Retrieval Augmented Generation with Vector DB, Hugging Face Embedders and Re-rankers
**Repository Overview**
This repository demonstrates the integration of Chroma DB, a vector database, with embedding models to develop a robust Retrieval Augmented Generation (RAG) system.
**Embedding Model Options**
1. **Ollama Embedding Model**
2. **Hugging Face Text Embedder**
3. **OpenAI Embedding Model**

**Re-ranker Integration (HTTP, gRPC)**
To enhance the accuracy of RAG, we can incorporate Hugging Face re-ranker models. These models evaluate the similarity between a query and the results retrieved from the vector DB; the re-ranker then ranks the results by index, ensuring that the retrieved information is relevant and contextually accurate.
```go
query := "What is Deep Learning?"
retrievedResults := []string{"Tomatoes are fruits...", "Deep Learning is not...", "Deep learning is..."}

// Re-ranker response (each index refers to a position in retrievedResults):
// [{"index":2,"score":0.9987814},{"index":1,"score":0.022949383},{"index":0,"score":0.000076250595}]
```
This repository demonstrates how to combine embedding and re-ranking to build a RAG system.
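To make the request/response shape above concrete, here is a minimal Go sketch that posts a query and candidate texts to a Hugging Face TEI re-ranker over HTTP. The `http://localhost:8080/rerank` URL is an assumption for a locally served TEI instance (see the re-ranker link in the steps below); verify the endpoint and response shape against your deployment.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// RerankResult matches the response shape shown in the example above.
type RerankResult struct {
	Index int     `json:"index"`
	Score float64 `json:"score"`
}

func main() {
	// Build the rerank request: one query plus the candidate texts
	// retrieved from the vector DB.
	body, err := json.Marshal(map[string]any{
		"query": "What is Deep Learning?",
		"texts": []string{"Tomatoes are fruits...", "Deep Learning is not...", "Deep learning is..."},
	})
	if err != nil {
		panic(err)
	}
	resp, err := http.Post("http://localhost:8080/rerank", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var results []RerankResult
	if err := json.NewDecoder(resp.Body).Decode(&results); err != nil {
		panic(err)
	}
	// Results come back scored; each index points into the input texts.
	for _, r := range results {
		fmt.Printf("index=%d score=%f\n", r.Index, r.Score)
	}
}
```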
## Steps followed to Implement this RAG System
1. **Set Up Vector Database**
   - Use Chroma DB to store your document embeddings.
   - Support for Ollama embedding models and Hugging Face [TEI](https://huggingface.co/docs/text-embeddings-inference/en/index).
2. **Preprocess Documents**
   - Split your documents into manageable chunks.
   - Generate embeddings for each chunk using an embedding model such as "nomic-embed-text" from Ollama (a minimal embedding call is sketched after this list).
3. **Store Embeddings**
   - Store the chunks and their corresponding embeddings in the Chroma DB vector database.
4. **Query Processing**
   - When you have a query:
     - Generate an embedding for the query.
     - Perform a similarity search within the vector database to identify the most relevant chunks based on their embeddings.
     - Retrieve these chunks as context for your query.
     - **Rerank** the results using a Hugging Face [re-ranker](https://huggingface.co/docs/text-embeddings-inference/en/quick_tour#re-rankers).
5. **Integrate with LLM Provider**
   - Supported LLM providers:
     - **Ollama**
     - **OpenAI**
6. **Create Prompt Template**
   - Design a prompt template that incorporates both the original query and the context retrieved from the vector database.
7. **Process with LLM**
   - Send the augmented prompt, including the query and reranked context, to the Large Language Model (LLM) for processing and response generation.

This approach enhances language processing tasks by leveraging the power of vector databases and advanced embedding models.
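As a companion to steps 2 and 3, here is a minimal Go sketch of generating a chunk embedding via Ollama's HTTP API. It assumes Ollama is running locally on its default port 11434 with the `nomic-embed-text` model pulled; the `/api/embeddings` request and response shapes follow Ollama's public API and are not taken from this repository's code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Embed one document chunk with Ollama's /api/embeddings endpoint.
	payload, err := json.Marshal(map[string]string{
		"model":  "nomic-embed-text",
		"prompt": "mirostat_tau controls the balance between coherence and diversity of the output.",
	})
	if err != nil {
		panic(err)
	}
	resp, err := http.Post("http://localhost:11434/api/embeddings", "application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The response carries the vector to store alongside the chunk in Chroma DB.
	var out struct {
		Embedding []float64 `json:"embedding"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println("embedding dimensions:", len(out.Embedding))
}
```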
## Sample Results
```txt
<|user|> what is mirostat_tau?:-
Based on the provided content, I can answer your query.

**Query Result:** Mirostat_tau Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)

**Document Content:**
mirostat_tau Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
float
mirostat_tau 5.0

**Additional Information on this Topic:**
Here are three main points related to Mirostat_tau:
1. **Coherence vs Diversity:** Mirostat_tau controls the balance between coherence and diversity of the output, which means it determines how focused or creative the generated text will be.
2. **Lower Values Mean More Focus:** A lower value for mirostat_tau results in more focused and coherent text, while a higher value allows for more diverse and potentially less coherent output.
3. **Default Value:** The default value for Mirostat_tau is 5.0, which means that if no specific value is provided, the model will generate text with a balance between coherence and diversity.

Please note that these points are based solely on the provided content and do not go beyond it.
```
## Getting Started
### Prerequisites
- Go (>=1.22.0)
- Docker
- Docker Compose
### Installation
1. **Clone the Repository**
```sh
git clone https://github.com/rupeshtr78/chroma-db-rag.git
cd chroma-db-rag
```
2. **Install Go Packages**
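   Assuming the project uses Go modules (not confirmed here), dependencies can typically be fetched with:
   ```sh
   go mod tidy
   ```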
3. **Build the Go Project**
```sh
go build -o chroma-db cmd/main.go
```
4. **Set Up Docker Containers**
Ensure Docker and Docker Compose are installed. Use the `docker-compose.yaml` to set up the Chroma DB service.
```sh
docker-compose up -d
```
### Running the Project
```sh
./chroma-db
Usage:
  -load
        Load and embed the data in vectordb.
        Provide the path to the file, e.g. "test/model_params.txt"
  -query
        Query the embedded data and rerank the results.
        Provide the query, e.g. "what is the difference between mirostat_tau and mirostat_eta?"
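
# Example invocations (the exact flag syntax is assumed from the usage text above):
./chroma-db -load "test/model_params.txt"
./chroma-db -query "what is the difference between mirostat_tau and mirostat_eta?"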
```
## Project Structure
- **cmd/**:
  - **main.go**: Entry point for running the application.
- **chat/**:
  - **ollama_chat.go**: Contains the logic for interacting with the Ollama chat model.
- **internal/constants/**:
  - **constants.go**: Houses all the necessary constants used across the project.
- **docker-compose.yaml**: Docker Compose configuration file for setting up the Chroma DB service.
## Configuration
Default configuration values are provided in `internal/constants/constants.go` and can be adjusted to fit your needs. These include:
- `ChromaUrl`, `TenantName`, `Database`, and `Namespace`
- `OllamaModel` and `OllamaUrl`
### Prompt Go Template
```go
<|system|> {{ .SystemPrompt }}
<|content|> {{ .Content }}
<|user|> {{ .Prompt }}
```
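For reference, here is a self-contained sketch of rendering this template with Go's standard `text/template` package; the `PromptData` struct name and the field values are illustrative, not taken from the repository's code.

```go
package main

import (
	"os"
	"text/template"
)

// PromptData mirrors the fields referenced by the template above.
type PromptData struct {
	SystemPrompt string
	Content      string
	Prompt       string
}

const promptTmpl = `<|system|> {{ .SystemPrompt }}
<|content|> {{ .Content }}
<|user|> {{ .Prompt }}`

func main() {
	t := template.Must(template.New("prompt").Parse(promptTmpl))
	data := PromptData{
		SystemPrompt: "Answer using only the provided context.",
		Content:      "mirostat_tau controls the balance between coherence and diversity...",
		Prompt:       "what is mirostat_tau?",
	}
	// Render the augmented prompt to stdout.
	if err := t.Execute(os.Stdout, data); err != nil {
		panic(err)
	}
}
```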
#### Running VectorDB
Start the vector database with the following command:
```sh
docker compose up
```
#### Chat with Ollama
Execute chat-related operations:
```sh
go run ./cmd/main.go
```
## License
This project is licensed under the BSD 3-Clause License - see the [LICENSE](./LICENSE) file for details.
## Acknowledgments
- [Chroma DB](https://github.com/chroma-core/chroma)
- [Ollama](https://ollama.com)

For any issues or contributions, please open an issue or submit a pull request on GitHub.