{"id":25900471,"url":"https://github.com/renaldiangsar/pdf-summarizer-qa","last_synced_at":"2026-05-08T02:20:50.398Z","repository":{"id":279754267,"uuid":"939824578","full_name":"renaldiangsar/PDF-Summarizer-QA","owner":"renaldiangsar","description":"PDF Text Summarization and QA app built using FastAPI, Langchain, and Streamlit","archived":false,"fork":false,"pushed_at":"2025-02-27T07:51:34.000Z","size":63,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-27T10:43:09.314Z","etag":null,"topics":["fastapi","langchain","langchain-python","large-language-models","stramlit"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/renaldiangsar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-27T07:02:11.000Z","updated_at":"2025-02-27T07:57:40.000Z","dependencies_parsed_at":"2025-02-27T10:43:46.167Z","dependency_job_id":"01b78bb8-e627-472b-bcf4-f3dcf5b2cde6","html_url":"https://github.com/renaldiangsar/PDF-Summarizer-QA","commit_stats":null,"previous_names":["renaldiangsar/pdf-summarizer-qa"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/renaldiangsar%2FPDF-Summarizer-QA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/renaldiangsar%2FPDF-Summarizer-QA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/renaldiangsar%2FPDF-Summarizer-QA/releases","manifests_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/renaldiangsar%2FPDF-Summarizer-QA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/renaldiangsar","download_url":"https://codeload.github.com/renaldiangsar/PDF-Summarizer-QA/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241596276,"owners_count":19988044,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","langchain","langchain-python","large-language-models","stramlit"],"created_at":"2025-03-03T02:16:57.681Z","updated_at":"2026-05-08T02:20:50.343Z","avatar_url":"https://github.com/renaldiangsar.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PDF Summarization \u0026 QA Web App\n\n## 📌 Overview\nThis project is a **FastAPI and Streamlit-based web application** that allows users to:\n- **Summarize PDF documents** using an LLM-powered summarization model.\n- **Ask questions** about the content of a PDF and receive relevant answers.\n\n## 🚀 Features\n- Upload a **PDF document** (one-time upload for both summarization and QA).\n- Generate **different summaries** every time you run summarization.\n- Perform **detailed summarization** for more insightful results.\n- Ask **questions related to the PDF** and get precise answers.\n- Uses **LangChain**, **Hugging Face embeddings**, and **FAISS** for retrieval.\n- Frontend built with **Streamlit** for a smooth user experience.\n\n---\n\n## 🛠️ Tech Stack\n- **Backend:** FastAPI, LangChain, Groq API, FAISS, Hugging Face embeddings\n- 
**Frontend:** Streamlit\n- **PDF Processing:** PyPDFLoader\n\n---\n\n## 🏗️ Installation \u0026 Setup\n### **Clone the Repository**\n```sh\ngit clone https://github.com/renaldiangsar/PDF-Summarizer-QA.git\ncd PDF-Summarizer-QA\n```\n\n### **Create a Virtual Environment \u0026 Install Dependencies**\n```sh\n# open command prompt and run\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\npip install -r requirements.txt\n```\n\n### **Run the Backend (FastAPI)**\n```sh\n# open command prompt and run\nuvicorn serve:app --reload # or just\n# python serve.py\n```\n\u003e The FastAPI server will start at `http://127.0.0.1:8000`\n\n### **Run the Frontend (Streamlit)**\n```sh\n# open command prompt and run\nstreamlit run client.py\n```\n\u003e The Streamlit app will open in your browser at `http://localhost:8501`\n\n### Don't forget to set your API keys in the .env file\n- Open the `.env` file and set your Groq and Hugging Face API keys.\n\n---\n\n## 🔄 Workflow (How it Works?)\n1. **User uploads a PDF** (file is stored temporarily).\n2. **User selects:**\n   - \"Summarize\" → Calls FastAPI `/summarize/` endpoint to generate a summary.\n   - \"Ask a Question\" → Calls `/ask/` endpoint with the query to get a response.\n3. **FastAPI processes the request** using:\n   - LangChain for text processing\n   - FAISS for document retrieval (for QA)\n   - Groq / Hugging Face models for LLM responses\n4. **Response is displayed** on the Streamlit UI.\n\n---\n\n## 🛠️ Customization \u0026 Improvements\n- Modify the **summarization prompt** in `serve.py` to change summary length/detail. 
Shorter summaries are generated faster.\n- Adjust the **chunk size** in `RecursiveCharacterTextSplitter` for better retrieval.\n- Use a **different LLM model** (e.g., GPT-4, LLaMA, or local models) for customization.\n- For heavy usage, consider a paid API such as the OpenAI API.\n\n---\n\n## 📝 Future Enhancements\n- Add **multilingual support** for summarization \u0026 QA.\n- Implement **document summarization history**.\n- Support **multiple PDFs at once**.\n- Find a better option for PDF processing, because PyPDFLoader does not give optimal results for unclean/irregular PDFs.\n\n---\n\n## Visual\n\u003cimg src=\"pdf-summerizer-QA-visual.jpg\" width=\"75%\"\u003e\n\n---\n\nThis is my first project on GitHub, and it still has many shortcomings. I hope I can do better in my next project. 🎉","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frenaldiangsar%2Fpdf-summarizer-qa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frenaldiangsar%2Fpdf-summarizer-qa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frenaldiangsar%2Fpdf-summarizer-qa/lists"}