https://github.com/nemat-al/qa_doc_bot
- Host: GitHub
- URL: https://github.com/nemat-al/qa_doc_bot
- Owner: nemat-al
- Created: 2024-12-12T20:25:04.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-12-17T07:28:06.000Z (10 months ago)
- Last Synced: 2025-02-06T12:24:28.612Z (8 months ago)
- Language: Jupyter Notebook
- Size: 4.88 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# QA_Doc_Bot
The bot is available in a Hugging Face Space: https://huggingface.co/spaces/nemat-al/QA_bot
This repository contains a question-answering (QA) bot based on retrieval-augmented generation (RAG). It uses LangChain and an LLM to answer questions about a loaded document.
Step by step:
1. Document loader: `PyPDFLoader` from LangChain loads the document.
2. Text splitter: `RecursiveCharacterTextSplitter` from LangChain splits the document into chunks.
3. Embedding: `hugging_face_embedding` produces the numerical representation of the text, which is stored in a vector store.
4. Vector database: `Chroma` stores the chunks and their embeddings and retrieves relevant content.
5. LLM: `google/flan-t5-base` is the model used by the RAG bot.
6. QA chain: accepts the LLM and a retriever object. The retriever is built on the vector store, which needs the embedding model and the chunks generated by the text splitter. The text splitter in turn needs the raw text loaded by the document loader.
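
The dependency chain described in step 6 can be illustrated with a small, dependency-free sketch. This is not the repository's code and it does not use the LangChain API: every name here (`split_text`, `embed`, `ToyVectorStore`, `answer`, `echo_llm`) is hypothetical. Sentence packing stands in for recursive splitting, word counts stand in for a learned embedding model, an in-memory list stands in for Chroma, and a function that echoes the retrieved context stands in for the LLM.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=60):
    """Pack whole sentences into chunks of roughly chunk_size characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > chunk_size:
            chunks.append(current)  # current chunk is full, start a new one
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database: keeps chunks and embeddings."""

    def __init__(self, chunks):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def retrieve(self, query, k=1):
        query_vec = embed(query)
        ranked = sorted(
            self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:k]]


def echo_llm(prompt):
    """Stand-in for a real model: returns the retrieved context unchanged."""
    return prompt.split("\n\nQuestion:")[0][len("Context:\n"):]


def answer(question, store, llm, k=1):
    """QA chain: retrieve context, build a prompt, pass it to the LLM."""
    context = "\n".join(store.retrieve(question, k))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)


# Raw text -> chunks -> vector store -> retriever -> QA chain
document = (
    "Chroma stores chunks together with their embeddings. "
    "The splitter breaks the document into smaller chunks. "
    "The language model writes the final answer from retrieved context."
)
chunks = split_text(document)
store = ToyVectorStore(chunks)
print(answer("Which component stores embeddings?", store, echo_llm))
```

The order of construction mirrors the list above: the QA function needs a retriever, the retriever needs a store, the store needs chunks and an embedding function, and the chunks need the raw text.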