https://github.com/imvision12/rag-llamaindex
Chat with your PDF files using LLM model and VectorDatabase
- Host: GitHub
- URL: https://github.com/imvision12/rag-llamaindex
- Owner: IMvision12
- License: mit
- Created: 2024-04-29T15:49:44.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-05-01T10:17:49.000Z (6 months ago)
- Last Synced: 2024-10-18T21:59:00.400Z (19 days ago)
- Topics: huggingface, langchain, llama2, llamaindex, llm, sentence-transformers, vector-database
- Language: Python
- Homepage:
- Size: 18.6 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# RAG-LlamaIndex
RAG-LlamaIndex uses a retrieval-augmented generation (RAG) architecture, together with Llama-2 and sentence transformers, to provide a search and summarization tool for PDF documents. Users can query PDF files in natural language and get relevant answers or summaries.
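
To make the retrieval step concrete, here is a minimal, standard-library-only sketch. It is not the project's code: the bag-of-words `embed` function is a toy stand-in for a sentence-transformer embedding model, the function names are hypothetical, and the example chunks are illustrative sentences rather than real PDF extracts.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector.

    A real pipeline would call a sentence-transformer model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


# Illustrative chunks; a real run would split the PDFs in the "data" folder.
chunks = [
    "LLaMA-65B was trained on A100-80GB.",
    "gemma-2b and gemma-7b differ in their count of embedding parameters.",
    "The PDF files are stored in a folder named data.",
]
top = retrieve("Which GPU was used to train LLaMA-65B?", chunks)
print(top)
```

The retrieved chunks are then handed to the language model as context, so the model answers from the documents rather than from memory alone.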
# Sample Input and Output
The data used in this example were research papers on LLaMA and Gemma.
Query: Which GPU was used to train LLaMA-65B?

Output:
```text
Loading checkpoint shards: 100% 2/2 [01:02<00:00, 31.34s/it]
LLM Output: LLaMA-65B was trained on A100-80GB.
```

Query: What is count of Embedding Parameters for gemma-2b and gemma-7b?

Output:
```text
Loading checkpoint shards: 100% 2/2 [01:02<00:00, 31.34s/it]
According to the context information, the count of embedding parameters for gemma-2b is 524,550,144, and for gemma-7b, it is 786,825,216.
```
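
The second answer opens with "According to the context information", which suggests that retrieved passages are passed to the model alongside the question. The sketch below shows one way such a prompt could be assembled. The template wording and the `build_prompt` helper are assumptions for illustration, not necessarily what this repository or LlamaIndex uses.

```python
# Hypothetical prompt template (assumption): context first, then the query.
PROMPT_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query}\n"
    "Answer: "
)


def build_prompt(query, chunks):
    """Join the non-empty retrieved chunks and fill in the template."""
    context = "\n\n".join(chunk.strip() for chunk in chunks if chunk.strip())
    return PROMPT_TEMPLATE.format(context=context, query=query.strip())


prompt = build_prompt(
    "Which GPU was used to train LLaMA-65B?",
    ["LLaMA-65B was trained on A100-80GB."],
)
print(prompt)
```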
# Setup 💻

1. Clone the GitHub repo:
```bash
$ git clone https://github.com/IMvision12/RAG-LlamaIndex.git
```
```bash
$ cd RAG-LlamaIndex
```

2. Install libraries:
```bash
$ pip install -r requirements.txt
```

3. Get PDF data:

The provided links download PDF files, which are stored in a folder named "data". If you have your own PDF files, move them to the "data" folder.
```bash
$ python utils.py --links https://arxiv.org/pdf/2302.13971 https://arxiv.org/pdf/2403.08295
```

4. Run main.py:
```bash
$ python main.py --data-directory "/content/RAG-LlamaIndex/data" \
--llm-model "meta-llama/Llama-2-7b-chat-hf" \
--embed-model "sentence-transformers/all-mpnet-base-v2" \
--hf-api "Your HuggingFace Access Token" \
--query "Enter your Query!"
```
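
For reference, the flags in the command above could be parsed with `argparse` roughly as follows. This is a sketch, not the contents of `main.py`: only the flag names come from the command, while the `build_parser` helper, the help strings, the choice of which flags are required, and the default values are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the README command."""
    parser = argparse.ArgumentParser(description="Query PDF files with an LLM (sketch).")
    parser.add_argument("--data-directory", required=True,
                        help="Folder containing the PDF files")
    # Defaults below reuse the README example values (assumption).
    parser.add_argument("--llm-model", default="meta-llama/Llama-2-7b-chat-hf",
                        help="Hugging Face model ID of the LLM")
    parser.add_argument("--embed-model", default="sentence-transformers/all-mpnet-base-v2",
                        help="Hugging Face model ID of the embedding model")
    parser.add_argument("--hf-api", required=True,
                        help="Hugging Face access token")
    parser.add_argument("--query", required=True,
                        help="Natural-language question about the PDFs")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args([
    "--data-directory", "data",
    "--hf-api", "hf_xxx",  # placeholder, not a real token
    "--query", "Which GPU was used to train LLaMA-65B?",
])
print(args.data_directory, args.llm_model, args.embed_model)
```

Note that `argparse` converts dashes in flag names to underscores, so `--data-directory` is read as `args.data_directory`.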