https://github.com/edcalderin/huggingface_ragflow
This project implements a classic Retrieval-Augmented Generation (RAG) system using HuggingFace models with quantization techniques. The system processes PDF documents, extracts their content, and enables interactive question-answering through a Streamlit web application.
- Host: GitHub
- URL: https://github.com/edcalderin/huggingface_ragflow
- Owner: edcalderin
- License: apache-2.0
- Created: 2025-03-11T21:01:33.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2025-03-20T20:10:14.000Z (11 months ago)
- Last Synced: 2025-03-20T20:54:54.071Z (11 months ago)
- Topics: bitsandbytes, cuda, huggingface, huggingface-embeddings, langchain, langchain-community, large-language-models, llm, nf4, python, qdrant, quantization, rag, retrieval-augmented-generation, ruff, streamlit, text-generation
- Language: Python
- Homepage:
- Size: 111 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# HuggingFace RAGFlow
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
## Overview
This project implements a classic Retrieval-Augmented Generation (RAG) system using HuggingFace models with quantization techniques. The system processes PDF documents, extracts their content, and enables interactive question-answering through a Streamlit web application.
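At its core, the retrieval step of a RAG system finds the stored document chunks most similar to the question and feeds them to the LLM as context. A minimal, dependency-free sketch of that idea, with a toy bag-of-words "embedding" standing in for the real `sentence-transformers` model used by this project:

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "The refund policy allows returns within 30 days.",
    "Our office is located in Buenos Aires.",
]
context = retrieve("How many days do I have to return a product?", chunks)
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: ..."
```

In the actual system, `embed` is a neural model, the chunks live in a Qdrant collection, and the assembled prompt is sent to the quantized LLM.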
## Prerequisites
- [Anaconda](https://www.anaconda.com/download/) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html) installed on your system
- Python 3.12 or higher
## Installation
### 1. Clone the repository
```bash
git clone https://github.com/edcalderin/HuggingFace_RAGFlow.git
cd HuggingFace_RAGFlow
```
### 2. Create and activate the Conda environment
```bash
# Create a new Conda environment with Python 3.12
conda create -n hg_ragflow python=3.12

# Activate the environment
conda activate hg_ragflow

# Install the dependencies
pip install -r requirements.txt
```
With older versions of Conda, you might need to use:
```bash
source activate hg_ragflow
```
If you have a GPU, install the CUDA-enabled PyTorch build:
```bash
pip3 install torch --index-url https://download.pytorch.org/whl/cu126
```
### 3. Verify the installation
```bash
# Verify that the environment is active
conda info --envs
# The active environment should be marked with an asterisk (*)
```
## Usage
### Development workflow
1. Rename `.env.example` to `.env` and set the `HUGGINGFACE_TOKEN` variable to your own HuggingFace token (created at https://huggingface.co/settings/tokens)
2. Load the embeddings into the Qdrant vector store:
```bash
python -m core.data_loader.vector_store
```
3. Run the Streamlit app:
```bash
python -m streamlit run app/streamlit.py
```
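The indexing step above embeds the extracted PDF text chunk by chunk. The module itself is not shown here, but splitters of this kind typically produce fixed-size chunks that overlap, so a sentence cut at a boundary still appears whole in a neighbouring chunk. A rough stdlib-only sketch (the sizes are illustrative, not the project's actual values):

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks where consecutive chunks
    share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "x" * 500
chunks = split_text(doc)  # three 200-character chunks, each overlapping the next by 50
```

Each resulting chunk is then embedded and written to the Qdrant collection named in the configuration below.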
### Configuration
The global parameters live in `core/config.py`; feel free to edit them:
```python
@dataclass(frozen=True)
class LLMConfig:
    EMBEDDING_MODEL_NAME: str = "sentence-transformers/all-mpnet-base-v2"  # embedding model
    COLLECTION_NAME: str = "historiacard_docs"
    QDRANT_STORE_PATH: str = "./tmp"  # directory for the Qdrant vector store

    # Model
    MODEL_NAME: str = "meta-llama/Llama-3.2-3B-Instruct"
    MODEL_TASK: str = "text-generation"  # task type
    TEMPERATURE: float = 0.1
    MAX_NEW_TOKENS: int = 1024
```
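Because the dataclass is frozen, its fields cannot be reassigned at runtime; to experiment with different settings without editing the file, derive a modified copy with `dataclasses.replace`. A small sketch using a trimmed-down version of the config above:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class LLMConfig:
    TEMPERATURE: float = 0.1
    MAX_NEW_TOKENS: int = 1024

base = LLMConfig()
# Frozen dataclasses raise FrozenInstanceError on assignment,
# so build a tweaked copy instead of mutating in place:
experiment = replace(base, TEMPERATURE=0.7)
```

This keeps the checked-in defaults intact while letting you run side-by-side experiments with, say, a higher sampling temperature.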
### Lint
Format and lint the code with Ruff:
```bash
ruff format .
ruff check . --fix
```
### Deactivating the environment
When you're done working on the project, deactivate the Conda environment:
```bash
conda deactivate
```
**Last but not least:**
Locate your HuggingFace cache directory and remove the embedding and model directories used by the project, as these may occupy several gigabytes of storage.
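If you're unsure how much space the cache occupies, a quick stdlib one-off can total it. The `~/.cache/huggingface` path is the default location on Linux; yours may differ (check the `HF_HOME` environment variable):

```python
from pathlib import Path

cache = Path.home() / ".cache" / "huggingface"
# Sum the sizes of all regular files under the cache directory.
total = sum(f.stat().st_size for f in cache.rglob("*") if f.is_file()) if cache.exists() else 0
print(f"{total / 1e9:.2f} GB")
```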
## Environment Configuration
### Requirements
The project includes a `requirements.txt` file that pins all required dependencies. Here's what it looks like:
```text
accelerate==1.5.2
bitsandbytes==0.45.3
langchain-community==0.3.19
langchain-core==0.3.44
langchain-huggingface==0.1.2
langchain-qdrant==0.2.0
pypdf==5.3.1
python-dotenv==1.0.1
ruff==0.9.10
streamlit==1.43.2
torch==2.6.0+cu126
transformers==4.49.0
```
## Project Structure
```
HuggingFace_RAGFlow/
├── app/                 # Streamlit app
│   └── streamlit.py     # Main application entry point
├── core/                # LLM components
│   ├── chain_creator/   # Conversational chain creation and memory management
│   ├── data_loader/     # Saving embeddings to the vector store
│   ├── model/           # LLM and embedding models
│   ├── retrieval/       # Vector store retriever
│   ├── utils/           # Logging configuration
│   └── config.py        # Global configuration parameters
└── README.md            # This file
```
## Contact
**LinkedIn:** https://www.linkedin.com/in/erick-calderin-5bb6963b/
**e-mail:** edcm.erick@gmail.com
If anything is unclear, feel free to create an issue 😊
## Enjoyed this content?
Explore more of my work on [Medium](https://medium.com/@erickcalderin)
I regularly share insights, tutorials, and reflections on tech, AI, and more. Your feedback and thoughts are always welcome!