https://github.com/wizo17/contextual_rag_application
RAG Application with Contextual Retrieval and Lexical Retrieval.
https://github.com/wizo17/contextual_rag_application
bm25 bm25-okapi langchain mlflow-tracking openai-api python rag streamlit
Last synced: 3 months ago
JSON representation
RAG Application with Contextual Retrieval and Lexical Retrieval.
- Host: GitHub
- URL: https://github.com/wizo17/contextual_rag_application
- Owner: Wizo17
- License: mit
- Created: 2025-03-24T22:19:03.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-04-02T15:42:31.000Z (6 months ago)
- Last Synced: 2025-04-02T16:44:09.885Z (6 months ago)
- Topics: bm25, bm25-okapi, langchain, mlflow-tracking, openai-api, python, rag, streamlit
- Language: Python
- Homepage:
- Size: 85.3 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
README
# Contextual Retrieval Augmented Generation Example
## Build with
The project uses:
* [Python](https://www.python.org/)
* [LangChain](https://www.langchain.com/)
* [Okapi BM25](https://fr.wikipedia.org/wiki/Okapi_BM25)
* [FAISS](https://github.com/facebookresearch/faiss)
* [Openai API](https://platform.openai.com/)
* [Anthropic API](https://console.anthropic.com/)
* [GoogleGenerativeAI API](https://aistudio.google.com/)
* [MLFlow Tracking](https://mlflow.org/docs/latest/tracking/)## Architecture

[https://www.anthropic.com/news/contextual-retrieval](https://www.anthropic.com/news/contextual-retrieval)## Setup
### Prerequisites
- Python 3.8+
- pip (Python package manager)### Installation
1. Clone this repository:
```bash
git clone https://github.com/Wizo17/contextual_rag_application.git
cd contextual_rag_application
```2. Create a virtual environment:
```bash
python -m venv venv
```3. Activate the virtual environment:
```bash
# Unix / MacOS
source venv/bin/activate
# Windows
venv\Scripts\activate
```4. Install dependencies:
```bash
pip install -r requirements.txt
```
If you have some issues, use python 3.12.0 and requirements_all.txt5. Create .env file:
```bash
cp .env.example .env
```6. **Update .env file**
## Running app
#### Keep or Delete and import **your own document in data/raw**
#### Start mlflow if MLFLOW_ENABLE = "yes"
Adapt host (MFFLOW_HOST) and port (MFFLOW_PORT)
```bash
mlflow server --host 127.0.0.1 --port 5000
```#### Build index first
It is not necessary if you keep my docs and my indexes
Remove all files in data/index before
```bash
python main.py
```#### Test application
Update queries if you don't keep my docs and my indexes
```bash
python test.py
```By launching streamlit, you can have competitive conflicts with `torch`. It is not a problem for this version.
```bash
streamlit run chatbot.py
```## Authors
* [@wizo17](https://github.com/Wizo17)
## License
This project is licensed under the ``MIT`` License - see [LICENSE](LICENSE.md) for more information.