https://github.com/headless-start/pdf-summary-app
https://github.com/headless-start/pdf-summary-app
langchain pdf
Last synced: 3 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/headless-start/pdf-summary-app
- Owner: headless-start
- License: mit
- Created: 2025-02-09T10:57:07.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-02-10T13:58:40.000Z (3 months ago)
- Last Synced: 2025-02-10T14:50:05.604Z (3 months ago)
- Topics: langchain, pdf
- Language: Python
- Homepage:
- Size: 1000 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# 📄 PDF Summary Tool
## 📌 Project Overview
This project demonstrates the **loading, preprocessing, and summarization** of text from **PDF files** using **Streamlit** and **LLM APIs like (Deepseek r1,o3-mini)**. The application allows users to upload a PDF, extract its text, and generate a concise summary. The summary can be downloaded as a `.txt` file for offline use.---
## 🚀 Key Features
1. **PDF Summarization**
- Users can upload a PDF file (up to 200MB) and get the summary of the document.---
## 🔍 How It Works
1. **Upload a PDF**:
- Users upload a PDF file using the file uploader in the app.
2. **Extract Text**:
- The app extracts text from the PDF using the `PyPDF2` library.
3. **Generate Summary**:
- The extracted text is sent to the LLM API (Deepseek r1, o3-mini), which generates a summary.
4. **Display and Download**:
- The summary is displayed on the app, and users can download it as a `.txt` file.**Check Demo Here**:
[](https://pdfdeepv1.streamlit.app/)---
## 🛠 System Requirements
### Dependencies
- Python 3.8+
- Libraries: `streamlit`, `PyPDF2`, `openai`
- Hardware: CPU (GPU not required)---
## 📄 License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.