https://github.com/patw/ragtag
A tool for testing RAG functionality with Atlas Vector Search
- Host: GitHub
- URL: https://github.com/patw/ragtag
- Owner: patw
- License: bsd-2-clause
- Created: 2023-10-20T12:27:26.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2024-10-24T13:41:37.000Z (2 months ago)
- Last Synced: 2024-10-25T15:44:24.974Z (2 months ago)
- Topics: atlas, mongodb, vector
- Language: Python
- Homepage:
- Size: 188 KB
- Stars: 7
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# RAGTAG
A tool for manual RAG chunk entry for question/answer systems. Create, search, or edit text chunks paired
up with questions to ensure good retrieval for embeddings. Takes advantage of MongoDB Atlas and Atlas Vector Search.
RAGTAG allows you to:
* Create Q/A chunks for use with your chatbots
* Vectorize chunks with open source embedding models (Instructor-large)
* Search existing chunks, edit them, update their embeddings, and update the chatbot in real time
* Test your chunks for recall by using real questions

![RAGTAG UI Screenshot](images/ragtag_ui.png)
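
To make the recall test in the list above concrete, here is a minimal local sketch of the idea. It is not RAGTAG's own code: in the real tool the ranking is done by Atlas Vector Search. The field names (`chunk_question`, `chunk_answer`, `chunk_embedding`, `chunk_enabled`) are taken from the index definition in this README, while the helper names and the toy 3-dimensional vectors are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_chunks(question_vec, chunks, k=3):
    """Return the k enabled chunks most similar to the question vector."""
    enabled = [c for c in chunks if c.get("chunk_enabled", True)]
    ranked = sorted(
        enabled,
        key=lambda c: cosine_similarity(question_vec, c["chunk_embedding"]),
        reverse=True,
    )
    return ranked[:k]


def recalled(question_vec, expected_question, chunks, k=3):
    """True if the chunk we expect appears among the top-k results."""
    hits = top_k_chunks(question_vec, chunks, k)
    return any(c["chunk_question"] == expected_question for c in hits)


# Toy 3-dimensional vectors stand in for real 768-dimensional embeddings.
chunks = [
    {"chunk_question": "How do I reset my password?",
     "chunk_answer": "Use the reset link on the login page.",
     "chunk_embedding": [0.9, 0.1, 0.0], "chunk_enabled": True},
    {"chunk_question": "What are your opening hours?",
     "chunk_answer": "9 to 5 on weekdays.",
     "chunk_embedding": [0.0, 0.2, 0.9], "chunk_enabled": True},
    {"chunk_question": "How do I change my password?",
     "chunk_answer": "Open account settings.",
     "chunk_embedding": [0.8, 0.3, 0.1], "chunk_enabled": False},
]

# Hypothetical embedding of a real user question about passwords.
question_vec = [1.0, 0.0, 0.1]
print(recalled(question_vec, "How do I reset my password?", chunks, k=1))
```

A chunk that fails this kind of check for realistic questions is a candidate for rewording its paired question or answer.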
## Installation
```pip install -r requirements.txt```
## Downloading the Mistral 7b model (with Dolphin fine-tune)
```wget https://huggingface.co/TheBloke/dolphin-2.1-mistral-7B-GGUF/resolve/main/dolphin-2.1-mistral-7b.Q5_K_S.gguf```
## Running App
Copy sample.env to .env and set the connection string for your Atlas instance.
```flask run```
**WARNING: You will need about 20 GB of RAM to run this process! Mistral-7b requires 14 GB with the Q5 quantization, and Instructor needs 4 GB on its own.**
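
Before starting the app, it can help to confirm that the .env file actually contains a connection string. The sketch below is a standard-library check, not part of RAGTAG. The key name `MONGO_URI` is a hypothetical placeholder, so look in sample.env for the real variable names.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, since connection strings contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values, required=("MONGO_URI",)):
    """Return the required keys that are absent or empty.

    MONGO_URI is a hypothetical name used for illustration only.
    """
    return [name for name in required if not values.get(name)]


# Example .env content with a made-up connection string.
sample = (
    "# Atlas connection\n"
    'MONGO_URI="mongodb+srv://user:secret@cluster.example.net/?retryWrites=true"\n'
)
settings = parse_env(sample)
print(missing_settings(settings))
```

To check a real file, read it with `open(".env").read()` and pass the result to `parse_env`.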
## Atlas Search Index
Create an Atlas Search index in the Atlas UI, under the Search tab, for the "chunks" collection
under the "ragtag" database:
```
{
  "analyzer": "lucene.english",
  "searchAnalyzer": "lucene.english",
  "mappings": {
    "dynamic": false,
    "fields": {
      "chunk_answer": {
        "type": "string"
      },
      "chunk_embedding": {
        "dimensions": 768,
        "similarity": "cosine",
        "type": "knnVector"
      },
      "chunk_enabled": {
        "type": "boolean"
      },
      "chunk_question": {
        "type": "string"
      }
    }
  }
}
```
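
Once the index exists, a query against it could look roughly like the sketch below. This is an illustration under stated assumptions, not the query RAGTAG itself issues. It builds an aggregation pipeline using the `knnBeta` operator of `$search`, which is the operator that pairs with `knnVector` fields; MongoDB has since deprecated `knnBeta` in favor of the `$vectorSearch` stage. The function name, the index name `default`, and the choice of `k` are assumptions; the field names and the 768-dimension check come from the index definition above.

```python
EMBEDDING_DIMENSIONS = 768  # matches "dimensions" in the index definition


def build_knn_pipeline(query_vector, k=5, index_name="default"):
    """Build an aggregation pipeline for a k-nearest-neighbour chunk search.

    index_name="default" is an assumption; use the name you gave the index.
    """
    if len(query_vector) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(query_vector)}"
        )
    return [
        {
            "$search": {
                "index": index_name,
                "knnBeta": {
                    "vector": list(query_vector),
                    "path": "chunk_embedding",
                    "k": k,
                    # Only consider chunks that are switched on.
                    "filter": {
                        "equals": {"path": "chunk_enabled", "value": True}
                    },
                },
            }
        },
        {
            "$project": {
                "_id": 0,
                "chunk_question": 1,
                "chunk_answer": 1,
                "score": {"$meta": "searchScore"},
            }
        },
    ]


# A placeholder vector; a real one would come from the embedding model.
pipeline = build_knn_pipeline([0.0] * EMBEDDING_DIMENSIONS, k=3)
print(pipeline[0]["$search"]["knnBeta"]["k"])
```

With pymongo, the pipeline would be passed to `aggregate` on the "chunks" collection of the "ragtag" database, for example `client["ragtag"]["chunks"].aggregate(pipeline)`.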