https://github.com/runtime-error786/rrf
chromadb huggingface-transformers langchain llama3 rrf
Last synced: 12 days ago
- Host: GitHub
- URL: https://github.com/runtime-error786/rrf
- Owner: runtime-error786
- Created: 2024-08-21T19:33:57.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-08-21T20:18:38.000Z (4 months ago)
- Last Synced: 2024-12-22T04:26:50.175Z (12 days ago)
- Topics: chromadb, huggingface-transformers, langchain, llama3, rrf
- Language: Jupyter Notebook
- Homepage:
- Size: 6.2 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# PDF Document QA System with Reciprocal Rank Fusion
This project is a question-answering (QA) system built with LangChain. It loads PDF documents, splits them into manageable chunks, stores the chunks in a Chroma vector store, and retrieves the most relevant chunks to answer user queries. To improve retrieval, the system generates queries related to the user's question, retrieves results for each one, and merges the resulting ranked lists with Reciprocal Rank Fusion (RRF).
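For reference, RRF is commonly defined by the scoring rule below. The symbols and the constant are the usual conventions for the technique, not values taken from this repository.

$$
\mathrm{RRF}(d) = \sum_{q \in Q} \frac{1}{k + \operatorname{rank}_q(d)}
$$

Here $Q$ is the set of queries (the original question plus the related ones), $\operatorname{rank}_q(d)$ is the 1-based position of chunk $d$ in the results for query $q$, and $k$ is a smoothing constant (60 is a common default). A chunk that is not retrieved for a given query contributes nothing to that term of the sum.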
## Features
- **PDF Document Loading**: Automatically loads and processes PDF files from a specified folder.
- **Text Chunking**: Splits the document text into chunks of a manageable size for efficient retrieval.
- **Vector Store**: Stores document chunks in a vector store using embeddings generated by a sentence-transformer model.
- **Reciprocal Rank Fusion (RRF)**: Improves retrieval accuracy by generating related queries, retrieving results for each, and combining them using RRF.
- **Question-Answering**: Retrieves the most relevant chunks of text and uses them to answer user queries.
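
The RRF step in the feature list can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's actual code: the function name `reciprocal_rank_fusion`, the chunk IDs, and the default `k=60` are all hypothetical choices made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical chunk IDs returned for the original query and two related queries.
results_per_query = [
    ["chunk_a", "chunk_b", "chunk_c"],
    ["chunk_b", "chunk_a", "chunk_d"],
    ["chunk_b", "chunk_c", "chunk_a"],
]

fused = reciprocal_rank_fusion(results_per_query)
top_ids = [doc_id for doc_id, _ in fused]
print(top_ids)
```

In this example `chunk_b` ends up first because it ranks at or near the top for all three queries, while `chunk_d`, which appears in only one list, ends up last. In a full pipeline, each inner list would presumably come from a vector-store similarity search for one generated query.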