# HuggingFace RAGFlow
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
## Overview
This project implements a classic Retrieval-Augmented Generation (RAG) system using HuggingFace models with quantization techniques. The system processes PDF documents, extracts their content, and enables interactive question-answering through a Streamlit web application.

## Prerequisites
- [Anaconda](https://www.anaconda.com/download/) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html) installed on your system
- Python 3.12 or higher

## Installation

### 1. Clone the repository
```bash
git clone https://github.com/edcalderin/HuggingFace_RAGFlow.git
cd HuggingFace_RAGFlow
```

### 2. Create and activate the Conda environment
```bash
# Create a new Conda environment with Python 3.12
conda create -n hg_ragflow python=3.12

# Activate the environment
conda activate hg_ragflow

# Install the project dependencies
pip install -r requirements.txt
```

If `conda activate` fails (e.g. with older Conda versions or in Git Bash on Windows), you might need to use:
```bash
source activate hg_ragflow
```

If you have a CUDA-capable GPU, install the CUDA build of PyTorch (the `cu126` index matches the pinned `torch==2.6.0+cu126`):
```bash
pip3 install torch --index-url https://download.pytorch.org/whl/cu126
```
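
To confirm that PyTorch can see the GPU:
```bash
python -c "import torch; print(torch.cuda.is_available())"
```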

### 3. Verify the installation
```bash
# Verify that the environment is active
conda info --envs

# The active environment should be marked with an asterisk (*)
```

## Usage

### Development workflow
1. Rename `.env.example` to `.env` and set the `HUGGINGFACE_TOKEN` variable to your own HuggingFace token (create one at https://huggingface.co/settings/tokens).
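
The resulting `.env` should contain a single line (the token value below is a placeholder):
```bash
HUGGINGFACE_TOKEN=hf_your_token_here
```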

2. Load embeddings to Qdrant Vector Store:
```bash
python -m core.data_loader.vector_store
```
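
For orientation, here is a minimal sketch of the kind of pipeline this command runs, assuming LangChain's PDF loader, the embedding model from `core/config.py`, and a local on-disk Qdrant store. The file path and chunk sizes are illustrative, not the project's exact code:
```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_qdrant import QdrantVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the PDF and split it into overlapping chunks ("document.pdf" is a placeholder)
docs = PyPDFLoader("document.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# Embed the chunks and persist them in a local (on-disk) Qdrant collection
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
QdrantVectorStore.from_documents(
    chunks,
    embedding=embeddings,
    path="./tmp",
    collection_name="historiacard_docs",
)
```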

3. Run Streamlit app:
```bash
python -m streamlit run app/streamlit.py
```
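
By default, Streamlit serves the app at http://localhost:8501.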

### Configuration
The global parameters live in `core/config.py`; feel free to edit them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConfig:
    # Embeddings and vector store
    EMBEDDING_MODEL_NAME: str = "sentence-transformers/all-mpnet-base-v2"  # embedding model
    COLLECTION_NAME: str = "historiacard_docs"
    QDRANT_STORE_PATH: str = "./tmp"  # directory for the Qdrant vector store

    # Model
    MODEL_NAME: str = "meta-llama/Llama-3.2-3B-Instruct"
    MODEL_TASK: str = "text-generation"  # task type
    TEMPERATURE: float = 0.1
    MAX_NEW_TOKENS: int = 1024
```
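
Since the project advertises NF4 quantization via bitsandbytes, here is a minimal sketch of how a model like `MODEL_NAME` is typically loaded in 4-bit NF4 with `transformers`; this illustrates the technique, not necessarily the project's exact code:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "meta-llama/Llama-3.2-3B-Instruct"

# 4-bit NF4 quantization (requires a CUDA GPU and the bitsandbytes package)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
)
```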
### Lint
Format and lint the code with Ruff:

```bash
ruff format .
ruff check . --fix
```
### Deactivating the environment
When you're done working on the project, deactivate the Conda environment:

```bash
conda deactivate
```

**Last but not least:**
Locate your cache directory and remove the embedding and model directories downloaded by the project, as they may occupy several gigabytes of storage.
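
On Linux/macOS, Hugging Face caches models under `~/.cache/huggingface` by default (configurable via `HF_HOME`); for example:
```bash
# Check how much space the cache uses
du -sh ~/.cache/huggingface

# Remove the model and embedding snapshots downloaded by this project
rm -rf ~/.cache/huggingface/hub/models--meta-llama--Llama-3.2-3B-Instruct
rm -rf ~/.cache/huggingface/hub/models--sentence-transformers--all-mpnet-base-v2
```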

## Environment Configuration

### Requirements
The project includes a `requirements.txt` file that pins all required dependencies:

```text
accelerate==1.5.2
bitsandbytes==0.45.3
langchain-community==0.3.19
langchain-core==0.3.44
langchain-huggingface==0.1.2
langchain-qdrant==0.2.0
pypdf==5.3.1
python-dotenv==1.0.1
ruff==0.9.10
streamlit==1.43.2
torch==2.6.0+cu126
transformers==4.49.0
```

## Project Structure
```
HuggingFace_RAGFlow/
├── app/                  # Streamlit app
│   └── streamlit.py      # Main application entry point
├── core/                 # Core RAG logic
│   ├── chain_creator/    # Conversational chain creation and memory management
│   ├── data_loader/      # Saving embeddings to the vector store
│   ├── model/            # LLM and embedding models
│   ├── retrieval/        # Vector store retriever
│   ├── utils/            # Logging configuration
│   └── config.py         # Global configuration parameters
└── README.md             # This file
```
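
As a rough illustration of how `retrieval` and `chain_creator` might fit together (using LangChain's conversational retrieval chain; the wiring below is an assumption, not the project's exact code):
```python
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_huggingface import HuggingFaceEmbeddings, HuggingFacePipeline
from langchain_qdrant import QdrantVectorStore

# Reuse the embeddings and the local Qdrant store created by core.data_loader
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vector_store = QdrantVectorStore.from_existing_collection(
    embedding=embeddings,
    collection_name="historiacard_docs",
    path="./tmp",
)

# Wrap a text-generation pipeline as a LangChain LLM
llm = HuggingFacePipeline.from_model_id(
    model_id="meta-llama/Llama-3.2-3B-Instruct",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 1024, "temperature": 0.1},
)

# Conversational RAG chain with chat memory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vector_store.as_retriever(),
    memory=memory,
)

result = chain.invoke({"question": "What is this document about?"})
print(result["answer"])
```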

## Contact
- **LinkedIn:** https://www.linkedin.com/in/erick-calderin-5bb6963b/
- **e-mail:** edcm.erick@gmail.com

Feel free to open an issue as well 😊

## Enjoyed this content?
Explore more of my work on [Medium](https://medium.com/@erickcalderin)

I regularly share insights, tutorials, and reflections on tech, AI, and more. Your feedback and thoughts are always welcome!