https://github.com/kid-sid/chat_with_pdf
This app takes PDFs from users and answers their queries using the text of those PDFs as context.
- Host: GitHub
- URL: https://github.com/kid-sid/chat_with_pdf
- Owner: kid-sid
- License: mit
- Created: 2024-11-30T17:59:01.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-23T10:00:33.000Z (about 1 year ago)
- Last Synced: 2025-02-23T11:18:06.851Z (about 1 year ago)
- Topics: faiss-vector-database, langchain, openai-api, python, rag, sqlite-database
- Language: Python
- Homepage:
- Size: 540 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Chat with your own PDF
You may have come across [chatpdf](https://www.chatpdf.com), where you can upload your PDF and ask questions to get answers from your PDF file. This repo contains the implementation of a chatbot like chatpdf.com, using Streamlit as the frontend.
The code implements a Retrieval-Augmented Generation (RAG) architecture that takes your PDF text as the context and answers your queries.
## **Overview of the architecture**

**Architecture:**
The Retrieval part retrieves relevant information from the vector store.
The Augmented part augments the prompt with our own custom data, extending the LLM's knowledge base.
The Generation part produces the answer using the LLM's next-word-prediction capability for text completion.
Most RAG apps follow more or less the same architecture as the diagram above.
Step one: a PDF is uploaded.
Step two: the text content is extracted from the PDF.
LLMs can't take all the text at once due to their context-length limitation, so the text is split into chunks.
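The chunking step can be sketched as a simple sliding window with overlap, so that sentences cut at a chunk boundary still appear whole in the neighboring chunk. This is a minimal illustration; the chunk size, overlap, and the function itself are assumptions, not taken from the repo (which may use a LangChain text splitter instead).

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters.

    Hypothetical sketch: sizes are illustrative, not the repo's settings.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Advance by less than the chunk size so consecutive chunks overlap.
        start += chunk_size - overlap
    return chunks
```

Each chunk is small enough to fit into the LLM's context window, and the overlap reduces the chance that an answer is split across two chunks.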
The chunks are then embedded into vectors and stored in a vector database, in our case FAISS.
Whenever someone asks a question, the most similar chunks are retrieved from the vector store and passed to the LLM as the context for its answer.
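The retrieval step boils down to ranking stored chunks by vector similarity to the query. The real app uses OpenAI embeddings and FAISS; the sketch below fakes the embedding with a toy bag-of-words vector so the flow is self-contained, and is an illustration of the idea rather than the repo's code.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word-frequency counts (stand-in for OpenAI embeddings).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by similarity to the query; FAISS does this at scale
    # with approximate nearest-neighbor search over dense vectors.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The top-k chunks returned here are what gets stuffed into the LLM prompt as context, alongside the user's question.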
**Credentials screenshot:**

New users need to sign up and then log in.
**Question answering interface:**

The Show Query History option shows all the previous queries asked by a user, along with their answers.
The Browse File option is used to upload a PDF.
Type your query in the text box below it and press Enter to get the response.
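The query-history feature above could be backed by a small SQLite table (the repo lists sqlite-database as a topic). The table and column names below are guesses for illustration, not the repo's actual schema.

```python
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per answered query, keyed by user.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS query_history "
        "(user TEXT, query TEXT, answer TEXT)"
    )

def save_query(conn: sqlite3.Connection, user: str, query: str, answer: str) -> None:
    conn.execute(
        "INSERT INTO query_history VALUES (?, ?, ?)", (user, query, answer)
    )

def get_history(conn: sqlite3.Connection, user: str) -> list[tuple[str, str]]:
    # Everything the "Show Query History" button would display for this user.
    return conn.execute(
        "SELECT query, answer FROM query_history WHERE user = ?", (user,)
    ).fetchall()
```

Each answered query is inserted after the LLM responds, and the history view is a single SELECT filtered by the logged-in user.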