https://github.com/dudeperf3ct/reliable-agentic-rag

Project exploring RAG LLM agents
- Host: GitHub
- URL: https://github.com/dudeperf3ct/reliable-agentic-rag
- Owner: dudeperf3ct
- License: MIT
- Created: 2024-10-16T07:22:13.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-11-11T06:47:20.000Z (6 months ago)
- Last Synced: 2025-01-15T21:55:24.606Z (4 months ago)
- Topics: llm-agent, rag
- Language: Python
- Homepage:
- Size: 507 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Reliable Agentic RAG with LLM Trustworthiness Estimates
This project is an exploration of RAG LLM agents and an attempt to replicate the blog post "Reliable Agentic RAG with LLM Trustworthiness Estimates".

It was tested on a machine with the following configuration:
```txt
Python - 3.12
uv - 0.4.25
GPU - Nvidia GeForce RTX 3060 Mobile
OS - Ubuntu 22.04.5 LTS
```

## Getting Started
Install [uv](https://docs.astral.sh/uv/).
Create a virtual environment using `uv`:
```bash
uv venv --python 3.12
source .venv/bin/activate
```

Install the dependencies required to run this project inside the virtual environment.
```bash
uv sync
```

### Download dataset
We will work with Nvidia's documentation for the Triton Inference Server.
The script below downloads all the pages in HTML format using `wget`.
```bash
cd scripts
bash get_data.sh
```

### Run data pipeline
```bash
cd ..
pwd
# /home/username/reliable-agentic-rag
```

> [!NOTE]
> Make sure you are running the following command from the root of the project (inside the `reliable-agentic-rag` folder).

```bash
python agentic_rag/run.py data
```

Running this command creates a `milvus.db` file that acts as the knowledge base. To learn more about the data pipeline, refer to the documentation on [Datapipeline](./docs/Datapipeline.md).
> [!TIP]
> The parameters that can be configured as part of the data pipeline are documented [here](./docs/Datapipeline.md#configuration).

### Run query pipeline
```bash
python agentic_rag/run.py query --query-text "How to make custom layers of TensorRT work in Triton?"
```

> [!WARNING]
> One (or two) API keys should be added to the `.env` file.
> The Trustworthy Language Model (TLM) by cleanlab.ai is used to estimate the trustworthiness score. Get an API key after creating an account.
> An API key for the LLM must also be added to the `.env` file:
> If the LLM is hosted locally, no API key is required; configure only the `LLM_MODEL` and `LLM_API_BASE` parameters.
> If the LLM is closed-source, its API key is required in the `.env` file.

For more information on how to configure various LLM providers, refer to the [documentation](./docs/Querypipeline.md#llm).
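The `.env` entries above can be loaded without any third-party library. Below is a minimal sketch that parses a simple `KEY=VALUE` file into the process environment; the helper name and the example values are illustrative, not part of this project.

```python
import os


def load_env(path: str = ".env") -> dict[str, str]:
    """Parse a simple KEY=VALUE .env file and export it to os.environ.

    Blank lines and lines starting with '#' are skipped.
    """
    values: dict[str, str] = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"')
    os.environ.update(values)
    return values


# Example .env contents for a locally hosted LLM (no API key needed);
# the model name and URL are placeholders:
#   LLM_MODEL=ollama/llama3
#   LLM_API_BASE=http://localhost:11434
```

Projects commonly use `python-dotenv` for this instead; the sketch only shows what the file format implies.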
> [!TIP]
> The parameters that can be configured as part of the query pipeline are documented [here](./docs/Querypipeline.md#configuration).

## Documentation
### Data pipeline
Documentation: [docs](./docs/Datapipeline.md)
### Query pipeline
Documentation: [docs](./docs/Querypipeline.md)
## Agentic RAG
Strictly speaking, the approach implemented in this project is not an agentic RAG approach. We manually provide the list of retrieval strategies, call the functions for the corresponding strategy, and select the next strategy depending on the trustworthiness score. To implement a truly autonomous agentic RAG approach, one option outlined in [this blog](https://vectorize.io/how-i-finally-got-agentic-rag-to-work-right/) is to use JSON mode for structured responses or to create multiple agents that collaborate.
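The semi-automated loop described above (try a strategy, score the answer, escalate if the score is too low) can be sketched as follows. The function names, the threshold value, and the tuple return shape are illustrative assumptions, not the project's actual API.

```python
from typing import Callable


def answer_with_fallback(
    query: str,
    strategies: list[Callable[[str], str]],
    trust_score: Callable[[str, str], float],
    threshold: float = 0.8,
) -> tuple[str, float]:
    """Try each retrieval strategy in order.

    Return the first answer whose trustworthiness score clears the
    threshold; if none does, return the best-scoring answer seen.
    """
    best_answer, best_score = "", float("-inf")
    for retrieve_and_answer in strategies:
        answer = retrieve_and_answer(query)
        score = trust_score(query, answer)
        if score >= threshold:
            # Trustworthy enough: stop escalating.
            return answer, score
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer, best_score
```

The key point the sketch makes concrete: the strategy ordering and the stopping rule are hard-coded by the developer, which is why the approach is not agentic in the strict sense.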
Some questions of my own:
- Will this fully autonomous agentic RAG approach outperform the current semi-automated RAG approach? What are advantages of using one over the other?
- In the current RAG approach, what different approaches could replace the [uncertainty estimator](./docs/Querypipeline.md#uncertainity-estimator) component?
- Is this RAG approach reliable and robust in all scenarios?

## Recommended Readings
- [Reliable Agentic RAG with LLM Trustworthiness Estimates](https://pub.towardsai.net/reliable-agentic-rag-with-llm-trustworthiness-estimates-c488fb1bd116)
- [How I finally got agentic RAG to work right](https://vectorize.io/how-i-finally-got-agentic-rag-to-work-right/)

## Further work
- [ ] Implement [Contextual Retrieval](https://www.anthropic.com/news/contextual-retrieval) by Anthropic. This will require changes to the data pipeline logic.
- [ ] Support GraphRAG, Query Rewriting, or Multi-Hop RAG as additional retrieval strategies
- [ ] Implement an [agentic RAG processing loop](https://vectorize.io/how-i-finally-got-agentic-rag-to-work-right/) (explained in the "Agentic RAG processing loop" section of the link)
- [ ] Make configuration more intuitive (using `pydantic`)