Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/parthsareen/simple-rag
Too many docs? Quickly search over any PDF or Markdown documents
llama local-llm obsidian obsidian-md ollama python rag
Last synced: 21 days ago
- Host: GitHub
- URL: https://github.com/parthsareen/simple-rag
- Owner: ParthSareen
- Created: 2024-09-23T18:22:37.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-11-30T22:15:35.000Z (24 days ago)
- Last Synced: 2024-11-30T23:23:54.815Z (24 days ago)
- Topics: llama, local-llm, obsidian, obsidian-md, ollama, python, rag
- Language: Python
- Homepage:
- Size: 279 KB
- Stars: 10
- Watchers: 2
- Forks: 1
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
Awesome Lists containing this project
README
# Simple RAG
Run RAG (Retrieval-Augmented Generation) for any documents!

Hey there! RAG-in-a-box is your go-to tool for quickly setting up Retrieval-Augmented Generation (RAG) on your docs. It's perfect for when you're tired of uploading docs to ChatGPT or want everything saved locally.

- Handles PDFs and Markdown files
- Works with OpenAI and Ollama models
- Easy-to-use interface, s/o Gradio
- CLI for you command-line lovers

## Prerequisites
1. Python 3.10 or higher
2. pip or rye package manager

### Installation
Choose one of the following methods:
- Using pip:
```
pip install -r requirements.lock
```
- Using rye:
```
rye sync
```

### API Keys

- For OpenAI models: Set the `OPENAI_API_KEY` environment variable with your OpenAI API key.
- For Ollama models: Ensure Ollama is installed and running on your system.

# Using the Gradio Interface

`python3 src/rag_in_a_box/interface.py`
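For a sense of what this step involves, here is a minimal hedged sketch of a Gradio front end in Python. `answer_question` is a hypothetical placeholder, not the project's actual function, and the environment check simply mirrors the API key note above.

```
# Hypothetical sketch only: `answer_question` stands in for the real query function.
import os

import gradio as gr


def answer_question(question: str) -> str:
    # A real implementation would retrieve relevant chunks from the vector
    # store and pass them, together with the question, to the chosen model.
    if not os.environ.get("OPENAI_API_KEY"):
        return "Set OPENAI_API_KEY (or switch to an Ollama model) first."
    return f"(answer for: {question})"


demo = gr.Interface(fn=answer_question, inputs="text", outputs="text", title="RAG-in-a-box")
demo.launch()
```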
# Using the CLI
To use RAG-in-a-box from the command line, use the `main.py` script. Here's how to get started:
1. **Basic Usage:**
   ```
   python3 src/rag_in_a_box/main.py --path <path_to_docs>
   ```

   Replace `<path_to_docs>` with the path to the directory containing your PDF or Markdown files.
2. **Specify Document Loader Type:**
By default, the script assumes you are loading PDF documents. If you want to load Markdown files, use the `--loader_type` argument:
   ```
   python3 src/rag_in_a_box/main.py --loader_type md --path <path_to_docs>
   ```

3. **Persisting the Vector Database:**
   You can specify a custom path to persist the vector database using the `--persist_path` argument (a rough sketch of this persistence step appears after the example command at the end of this section):

   ```
   python3 src/rag_in_a_box/main.py --path <path_to_docs> --persist_path <persist_path>
   ```

4. **Choosing the Model:**
   The script supports both OpenAI and Ollama models. You can specify the model to use with the `--model` argument:

   ```
   python3 src/rag_in_a_box/main.py --path <path_to_docs> --model <model_name>
   ```

5. **Interactive Q&A:**
   Once the documents are loaded and stored, the script enters an interactive Q&A loop. Simply type your questions, and the system will provide answers based on the loaded documents. Type `quit` to exit the loop (a sketch of this loop follows the example command below).

Example command:
```
python3 src/rag_in_a_box/main.py --loader_type pdf --path ./docs --persist_path ./chroma_db --model gpt-4o-mini-2024-07-18
```

This command loads PDF documents from the `./docs` directory, stores them in the `./chroma_db` vector database, and uses the `gpt-4o-mini-2024-07-18` model for Q&A.
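For readers curious what steps 1–4 might look like inside `main.py`, here is a hedged sketch of loading Markdown files and persisting them to a Chroma collection. The collection name, chunking, and the use of the `chromadb` package are assumptions suggested by the `./chroma_db` default, not the project's actual code.

```
# Hedged sketch: load Markdown files from --path and persist them in a
# chromadb collection at --persist_path. Names and chunking are illustrative.
import argparse
import glob
import os

import chromadb

parser = argparse.ArgumentParser()
parser.add_argument("--path", required=True, help="Directory of PDF/Markdown files")
parser.add_argument("--loader_type", default="pdf", choices=["pdf", "md"])
parser.add_argument("--persist_path", default="./chroma_db")
args = parser.parse_args()

client = chromadb.PersistentClient(path=args.persist_path)
collection = client.get_or_create_collection("docs")

pattern = "*.md" if args.loader_type == "md" else "*.pdf"
for file_path in glob.glob(os.path.join(args.path, pattern)):
    if args.loader_type == "pdf":
        continue  # PDF extraction needs a parser such as pypdf; omitted here
    with open(file_path, encoding="utf-8") as f:
        text = f.read()
    # Naive fixed-size chunking; real loaders usually split more carefully.
    chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]
    if chunks:
        collection.add(
            documents=chunks,
            ids=[f"{os.path.basename(file_path)}-{i}" for i in range(len(chunks))],
        )
```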
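And a similarly hedged sketch of the interactive Q&A loop from step 5, assuming an OpenAI chat model and the same Chroma collection; swapping in an Ollama model would mean calling the `ollama` client instead. All names here are illustrative, not the project's API.

```
# Hedged sketch of the Q&A loop: retrieve similar chunks, stuff them into the
# prompt, and answer until the user types "quit". Illustrative only.
import chromadb
from openai import OpenAI

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("docs")
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

while True:
    question = input("Ask a question (or 'quit' to exit): ").strip()
    if question.lower() == "quit":
        break
    # Pull the most similar chunks from the vector store.
    results = collection.query(query_texts=[question], n_results=4)
    context = "\n\n".join(results["documents"][0])
    response = llm.chat.completions.create(
        model="gpt-4o-mini-2024-07-18",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    print(response.choices[0].message.content)
```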