# RAGU: Retrieval-Augmented Generation Utility

RAGU is a complete RAG (Retrieval-Augmented Generation) management and interrogation system built with FastAPI and ChromaDB. It provides a powerful API for document management, vector storage, and LLM-powered chat functionality with support for multiple LLM providers.

## Features

- **Document Management**: Upload, process, and manage documents in vector collections
- **Document Tagging**: Add tags to documents for better organization and filtering
- **Vector Database**: ChromaDB integration for efficient semantic search
- **LLM Integration**: Support for multiple LLM providers:
  - Ollama with local or remote models (including nomic-embed-text)
  - Anthropic with Claude models (including Claude 3 Haiku)
  - OpenAI with GPT models (including GPT-4o)
- **Conversational UI**: Interactive chat interface with conversation history and context awareness
- **WebSocket Chat**: Real-time, streaming chat interface with RAG capabilities
- **REST API**: Comprehensive REST API for all operations
- **Document Processing**: Support for various document types (PDF, DOCX, CSV, text)

## Installation

### Prerequisites

- Python 3.8+
- Virtual environment (optional but recommended)

### Setup

1. Clone the repository:
```bash
git clone https://github.com/grepory/ragu.git
cd ragu
```

2. Create and activate a virtual environment (optional but recommended):
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```

3. Install dependencies:
```bash
pip install -r requirements.txt
```

4. Copy the `.env.example` file to `.env` in the project root and update it with your LLM provider configuration:
```bash
cp .env.example .env
# Then edit the .env file with your preferred text editor
```

The `.env.example` file contains all the necessary environment variables with default values and explanatory comments. You only need to configure the providers you plan to use.

## Running the Application

Start the server with:

```bash
python run_app.py
```

This will:
- Start the server on the configured port (default 8000)
- Display clear URLs for accessing the application
- Automatically open your default web browser to the frontend

Alternatively, you can start the server manually with:

```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

The API will be available at http://localhost:8000, and the API documentation at http://localhost:8000/docs.

## Configuration

RAGU can be configured through environment variables in the `.env` file:

### Server Configuration

- `PORT=8000` - Port for the FastAPI server
- `ROOT_PATH=/ragu` - Root path prefix for reverse proxy setup (e.g., for hosting at http://example.com/ragu/)
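
When starting the server manually behind a reverse proxy, uvicorn's own `--root-path` flag covers the same case (a sketch; how RAGU wires `ROOT_PATH` through internally is not shown here):

```bash
# Serve the app under the /ragu prefix behind a reverse proxy
uvicorn app.main:app --host 0.0.0.0 --port 8000 --root-path /ragu
```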

### LLM Provider Configuration

Configure your preferred LLM providers in the `.env` file:
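
The variable names below are illustrative; `.env.example` documents the exact keys this project expects:

```bash
# Default provider used when a request does not specify one (illustrative names)
DEFAULT_LLM_PROVIDER=ollama

# Ollama (local or remote)
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama2

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-3-haiku-20240307

# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o
```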

## API Documentation

### Collections API

Collections are used to organize documents in the vector database.

#### Create a Collection

```http
POST /api/v1/collections/
Content-Type: application/json

{
  "name": "my_collection",
  "description": "My first collection"
}
```

#### List Collections

```http
GET /api/v1/collections/
```

#### Get Collection

```http
GET /api/v1/collections/{collection_name}
```

#### Delete Collection

```http
DELETE /api/v1/collections/{collection_name}
```
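
Assuming the server is running locally, the list, get, and delete endpoints can be exercised with `requests` (a sketch; only the endpoints shown above are assumed):

```python
import requests

BASE_URL = "http://localhost:8000/api/v1"

# List all collections
print(requests.get(f"{BASE_URL}/collections/").json())

# Fetch, then delete, a single collection by name
print(requests.get(f"{BASE_URL}/collections/my_collection").json())
requests.delete(f"{BASE_URL}/collections/my_collection").raise_for_status()
```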

### Documents API

Documents are the core content stored in the vector database.

#### Upload a Document

```http
POST /api/v1/documents/upload
Content-Type: multipart/form-data

file: [file]
collection_name: my_collection
tags: personal,house,important
additional_metadata: {"source": "website", "author": "John Doe"}
```

The `tags` parameter is optional and should be a comma-separated list of tags to associate with the document.

#### Add Text Directly

```http
POST /api/v1/documents/text
Content-Type: application/json

{
  "text": "This is some text to add to the vector database.",
  "collection_name": "my_collection",
  "tags": ["personal", "notes", "important"],
  "metadata": {"source": "direct_input", "author": "Jane Smith"}
}
```

The `tags` parameter is optional and should be an array of strings representing tags to associate with the document.
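
For example, the same request sent from Python (a sketch mirroring the JSON payload above):

```python
import requests

# Post raw text straight into a collection, with optional tags and metadata
resp = requests.post(
    "http://localhost:8000/api/v1/documents/text",
    json={
        "text": "This is some text to add to the vector database.",
        "collection_name": "my_collection",
        "tags": ["personal", "notes"],
        "metadata": {"source": "direct_input"},
    },
)
print(resp.json())
```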

#### Get Document

```http
GET /api/v1/documents/{collection_name}/{document_id}
```

#### Delete Document

```http
DELETE /api/v1/documents/{collection_name}/{document_id}
```

#### Query Documents

```http
POST /api/v1/documents/query
Content-Type: application/json

{
  "collection_name": "my_collection",
  "query_text": "What is retrieval-augmented generation?",
  "n_results": 5,
  "where": {
    "tags": {"$in": ["personal", "important"]}
  }
}
```

The `where` parameter is optional and can be used to filter documents by metadata fields, including tags. In the example above, the query will only return documents that have either the "personal" or "important" tag. You can use other operators like `$eq` for exact match or `$contains` for substring match.
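
For instance, a `where` clause that matches an exact metadata field instead of tags (a sketch; the operator syntax follows the ChromaDB-style filters shown above):

```json
{
  "collection_name": "my_collection",
  "query_text": "What is retrieval-augmented generation?",
  "n_results": 5,
  "where": {
    "author": {"$eq": "Jane Smith"}
  }
}
```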

### Chat API

The Chat API provides both REST and WebSocket interfaces for interacting with the LLM.

#### Model Selection

You can specify which LLM provider and model to use by setting the `model` parameter in your requests. The format is:

```
provider:model
```

For example:
- `ollama:llama2` - Use Ollama with the llama2 model
- `anthropic:claude-3-haiku-20240307` - Use Anthropic with the Claude 3 Haiku model
- `openai:gpt-4o` - Use OpenAI with the GPT-4o model

If you don't specify a provider, the default provider from your configuration will be used. If you don't specify a model, the default model for the selected provider will be used.
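
As a concrete sketch, selecting a specific provider and model through the REST chat endpoint (documented in full below):

```python
import requests

resp = requests.post(
    "http://localhost:8000/api/v1/chat/",
    json={
        "collection_name": "my_collection",
        "query": "Summarize my documents.",
        "history": [],
        # provider:model format; falls back to configured defaults if omitted
        "model": "anthropic:claude-3-haiku-20240307",
    },
)
print(resp.json())
```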

#### REST Chat Endpoint

```http
POST /api/v1/chat/
Content-Type: application/json

{
  "collection_name": "my_collection",
  "query": "What is retrieval-augmented generation?",
  "history": [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there! How can I help you?"}
  ],
  "model": "llama2"
}
```

#### WebSocket Chat

Connect to the WebSocket endpoint at `/api/v1/chat/ws` and send JSON messages with the following format:

```json
{
  "type": "chat",
  "collection_name": "my_collection",
  "query": "What is retrieval-augmented generation?",
  "history": [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there! How can I help you?"}
  ],
  "model": "llama2"
}
```

The server will respond with messages in the following formats:

1. Start message:
```json
{
  "type": "start",
  "content": "Generating response..."
}
```

2. Token messages (streaming):
```json
{
  "type": "token",
  "content": "piece of text"
}
```

3. Complete message:
```json
{
  "type": "complete",
  "content": {
    "answer": "The complete answer...",
    "sources": [
      {
        "id": "document_id",
        "text": "Snippet of the source document...",
        "metadata": {"source": "document.pdf", "page": 1}
      }
    ],
    "history": [
      {"role": "user", "content": "What is RAG?"},
      {"role": "assistant", "content": "The complete answer..."}
    ]
  }
}
```

4. Error message:
```json
{
  "type": "error",
  "content": "Error message"
}
```

## Web Interface

RAGU provides a responsive web interface that works seamlessly on desktop and mobile devices. The interface includes:

### Document Upload

- Upload files using the upload button
- Add tags to organize your documents
- View upload progress and status

### Conversational Chat Interface

The chat interface allows you to have natural conversations with the AI about your documents:

- **Conversation History**: The system maintains the full conversation history, allowing for contextual follow-up questions
- **Conversation Sidebar**: Browse and manage your chat history with a collapsible sidebar
- **Enter Key Support**: Press Enter to send your message (use Shift+Enter for new lines)
- **Visual Distinction**: User and assistant messages are visually distinct for easy reading
- **Source Citations**: The system displays the source documents used to generate responses
- **Mobile Responsive**: Optimized interface for mobile devices with touch-friendly controls
- **Tag-based Filtering**: Filter conversations and documents by tags with advanced search capabilities

To use the conversational interface:

1. Select a set of tags from the dropdown (or choose to search all tags)
2. (Optional) Select a specific LLM model and provider
3. Type your question in the message box
4. Press Enter or click "Send Message"
5. View the AI's response and the sources used
6. Continue the conversation with follow-up questions

The system will maintain context between questions, allowing for a more natural conversation flow. On mobile devices, the conversation sidebar can be toggled using the sidebar button in the top navigation.

## Examples

### Python Client Example

Here's a simple Python client for interacting with the API:

```python
import requests
import websockets
import json
import asyncio

BASE_URL = "http://localhost:8000/api/v1"

# Create a collection
def create_collection(name, description=None):
    response = requests.post(
        f"{BASE_URL}/collections/",
        json={"name": name, "description": description}
    )
    return response.json()

# Upload a document
def upload_document(file_path, collection_name, tags=None, metadata=None):
    data = {"collection_name": collection_name}
    if tags:
        data["tags"] = ",".join(tags)
    if metadata:
        data["additional_metadata"] = json.dumps(metadata)

    # Open the file in a context manager so the handle is always closed
    with open(file_path, "rb") as f:
        response = requests.post(
            f"{BASE_URL}/documents/upload",
            files={"file": f},
            data=data
        )
    return response.json()

# Query documents
def query_documents(collection_name, query_text, n_results=5, tags=None):
    query_data = {
        "collection_name": collection_name,
        "query_text": query_text,
        "n_results": n_results
    }

    # Add tag filtering if specified
    if tags:
        query_data["where"] = {"tags": {"$in": tags}}

    response = requests.post(
        f"{BASE_URL}/documents/query",
        json=query_data
    )
    return response.json()

# WebSocket chat client
async def chat_websocket(query, tags=None, history=None, model=None, include_untagged=True):
    uri = "ws://localhost:8000/api/v1/chat/ws"
    async with websockets.connect(uri) as websocket:
        # Send chat message
        message = {
            "type": "chat",
            "query": query,
            "history": history or [],
            "include_untagged": include_untagged
        }
        if tags:
            message["tags"] = tags
        if model:
            message["model"] = model

        await websocket.send(json.dumps(message))

        # Process responses
        full_response = ""
        while True:
            response = await websocket.recv()
            data = json.loads(response)

            if data["type"] == "token":
                # Print each token as it arrives
                print(data["content"], end="", flush=True)
                full_response += data["content"]

            elif data["type"] == "complete":
                # Chat complete; show the sources used
                print("\n\nSources:")
                for i, source in enumerate(data["content"]["sources"]):
                    print(f"{i+1}. {source['text']} (ID: {source['id']})")

                return data["content"]

            elif data["type"] == "error":
                # Error occurred
                print(f"\nError: {data['content']}")
                return None

# Example usage
if __name__ == "__main__":
    # Create a collection
    create_collection("my_docs", "My documents collection")

    # Upload a document with tags
    upload_document(
        "path/to/document.pdf",
        "my_docs",
        tags=["house", "important", "insurance"],
        metadata={"source": "local", "author": "Insurance Company"}
    )

    # Upload another document with different tags
    upload_document(
        "path/to/another_document.pdf",
        "my_docs",
        tags=["personal", "important", "legal"],
        metadata={"source": "local", "author": "Law Firm"}
    )

    # Query documents without tag filtering
    results = query_documents("my_docs", "What is RAG?")

    # Query documents with tag filtering (only "house" tagged documents)
    house_results = query_documents("my_docs", "What is RAG?", tags=["house"])

    # Query documents with multiple tags (tagged either "important" or "personal")
    important_results = query_documents("my_docs", "What is RAG?", tags=["important", "personal"])

    # Chat with WebSocket - single query
    asyncio.run(chat_websocket("Explain RAG in simple terms."))

    # Chat with WebSocket - query with tag filtering
    asyncio.run(chat_websocket(
        "What can you tell me about my house documents?",
        tags=["house", "important"]
    ))

    # Chat with WebSocket - conversation with history
    async def conversation_example():
        # First query
        print("\n--- Starting conversation ---")
        response = await chat_websocket("Explain RAG in simple terms.")

        # Get the updated history from the response
        conversation_history = response["history"]

        # Follow-up query using the conversation history with tag filtering
        print("\n--- Follow-up question with tag filtering ---")
        response = await chat_websocket(
            "What are the main benefits compared to traditional approaches?",
            tags=["important", "personal"],
            history=conversation_history
        )

        # Continue the conversation with another follow-up
        conversation_history = response["history"]
        print("\n--- Another follow-up question ---")
        await chat_websocket(
            "Can you give me a simple code example?",
            history=conversation_history
        )

    # Run the conversation example
    asyncio.run(conversation_example())
```

## License

MIT

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.