https://github.com/aktersnurra/rag
Retrieval Augmented Generation
cohere ollama qdrant-vector-database retrieval-augmented-generation
- Host: GitHub
- URL: https://github.com/aktersnurra/rag
- Owner: aktersnurra
- Created: 2024-04-03T20:03:14.000Z (9 months ago)
- Default Branch: master
- Last Pushed: 2024-04-13T11:55:16.000Z (8 months ago)
- Last Synced: 2024-04-14T00:44:25.751Z (8 months ago)
- Topics: cohere, ollama, qdrant-vector-database, retrieval-augmented-generation
- Language: Python
- Homepage:
- Size: 687 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Retrieval Augmented Generation
RAG with ollama (and optionally cohere) and qdrant. This is basically a glorified
(bloated) `grep`.

## Usage
### Setup
#### 1. Environment Variables
Create a .env file or set the following parameters:
```.env
CHUNK_SIZE=4096
CHUNK_OVERLAP=256

ENCODER_MODEL=nomic-embed-text
EMBEDDING_DIM=768

RETRIEVER_TOP_K=15
RETRIEVER_SCORE_THRESHOLD=0.5

RERANK_MODEL=mixedbread-ai/mxbai-rerank-large-v1
RERANK_TOP_K=5

GENERATOR_MODEL=llama3

DOCUMENT_DB_NAME=rag
DOCUMENT_DB_USER=aktersnurra

QDRANT_URL=http://localhost:6333
QDRANT_COLLECTION_NAME=knowledge-base

COHERE_API_KEY = # OPTIONAL
COHERE_RERANK_MODEL = "rerank-english-v3.0"
```

#### 2. Install Python Dependencies
```
poetry install
```

#### 3. Ollama
Make sure ollama is running:
```sh
ollama serve
```

Download the encoder and generator models with ollama:
```sh
ollama pull $GENERATOR_MODEL
ollama pull $ENCODER_MODEL
```

#### 4. Qdrant
Qdrant is used to store the embeddings of the chunks from the documents.
Download and run qdrant.
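
The chunks whose embeddings end up in Qdrant are shaped by `CHUNK_SIZE` and `CHUNK_OVERLAP` from step 1. The following is a minimal sketch of overlapping chunking, assuming the sizes are counted in characters; the repository may count tokens or use a library splitter instead, and `chunk_text` is a hypothetical name.

```python
def chunk_text(text: str, chunk_size: int = 4096, chunk_overlap: int = 256) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap.

    Assumption: sizes are measured in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the values from the `.env` example, each chunk shares its last 256 characters with the start of the next one, so a sentence cut at a boundary still appears whole in at least one chunk when it is shorter than the overlap.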
#### 5. Postgres
Postgres is used to save hashes of the documents to prevent documents from
being added to the vector db more than once.

Download and run Postgres.
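
The deduplication idea can be sketched without a database. This is an illustration only: SHA-256 is an assumption (the README does not say which hash is used), and the in-memory set stands in for the Postgres table. `document_hash` and `SeenDocuments` are hypothetical names.

```python
import hashlib


def document_hash(content: bytes) -> str:
    """Return a hex digest identifying the document content (SHA-256 assumed)."""
    return hashlib.sha256(content).hexdigest()


class SeenDocuments:
    """In-memory stand-in for the Postgres table of document hashes."""

    def __init__(self) -> None:
        self._hashes: set[str] = set()

    def add_if_new(self, content: bytes) -> bool:
        """Record the document and return True, or return False if already seen."""
        digest = document_hash(content)
        if digest in self._hashes:
            return False
        self._hashes.add(digest)
        return True
```

A caller would chunk, embed, and upload a document only when `add_if_new` returns `True`, which keeps identical files from being added to the vector db twice.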
#### 6. Cohere
Get an API key from the Cohere website. This step is optional.
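
Because the Cohere key is optional, any code that reads the variables from step 1 has to tolerate it being unset. Below is a minimal sketch of such a reader, not the repository's actual configuration code. The `Settings` class and `load_settings` function are hypothetical, and using the example values from step 1 as defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    chunk_size: int
    chunk_overlap: int
    retriever_top_k: int
    retriever_score_threshold: float
    rerank_top_k: int
    cohere_api_key: Optional[str]

    @property
    def use_cohere(self) -> bool:
        # Assumption: Cohere is used only when an API key is present.
        return self.cohere_api_key is not None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping of environment variables.

    Defaults mirror the example values from step 1 (an assumption).
    """
    key = env.get("COHERE_API_KEY", "").strip()
    return Settings(
        chunk_size=int(env.get("CHUNK_SIZE", "4096")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "256")),
        retriever_top_k=int(env.get("RETRIEVER_TOP_K", "15")),
        retriever_score_threshold=float(env.get("RETRIEVER_SCORE_THRESHOLD", "0.5")),
        rerank_top_k=int(env.get("RERANK_TOP_K", "5")),
        cohere_api_key=key or None,  # empty or missing key means "no Cohere"
    )
```

Passing a plain dict instead of `os.environ` makes the reader easy to exercise, for example `load_settings({"COHERE_API_KEY": "..."}).use_cohere`.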
### Running
Activate the poetry shell:
```sh
poetry shell
```

Use the cli:
```sh
python rag/cli.py
```

or the ui using a browser:
```sh
streamlit run rag/ui.py
```

### Notes
Yes, it is inefficient/dumb to use ollama when you can just load the models with python
in the same process.

### TODO
- [x] Rerank history if it is relevant.
- [x] message ollama/cohere
- [x] create db script
- [x] write a general model for cli/ui
- [ ] use huggingface instead of ollama
- [ ] Refactor messages
### Inspiration
I took some inspiration from these tutorials:
- [rag-openai-qdrant](https://colab.research.google.com/github/qdrant/examples/blob/master/rag-openai-qdrant/rag-openai-qdrant.ipynb)
- [building-rag-application-using-langchain-openai-faiss](https://medium.com/@solidokishore/building-rag-application-using-langchain-openai-faiss-3b2af23d98ba)
- [knowledge_gpt](https://github.com/mmz-001/knowledge_gpt)