Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/gowthamsiddarthademan/pdf_summarizer

This is a PDF Summarizer Web App built using Flask and Hugging Face Transformers. The application allows users to upload PDF files and generate summarized text. The backend uses pre-trained NLP models for text summarization, powered by Hugging Face's Transformers library and PyTorch.
https://github.com/gowthamsiddarthademan/pdf_summarizer

flask huggingface-transformers python pytorch summarizer webapp

Last synced: about 1 month ago
JSON representation

This is a PDF Summarizer Web App built using Flask and Hugging Face Transformers. The application allows users to upload PDF files and generate summarized text. The backend uses pre-trained NLP models for text summarization, powered by Hugging Face's Transformers library and PyTorch.

Awesome Lists containing this project

README

        

# **PDF Summarizer Web App using Hugging Face Transformers**

This is a **PDF Summarizer Web App** built using **Flask** and **Hugging Face Transformers**. The application allows users to upload PDF files and generate summarized text. The backend uses pre-trained NLP models for text summarization, powered by **Hugging Face's Transformers** library and **PyTorch**.

#### **Key Features:**

- **PDF Upload**: Users can upload PDF files.
- **Text Extraction**: The app extracts text from the PDF and processes it for summarization.
- **Text Summarization**: Uses Hugging Face pre-trained models (e.g., BART or T5) to summarize the extracted text.
- **Web Interface**: Simple and clean user interface built with HTML and CSS, making it easy for users to interact with the app.
- **Real-time Summary**: After uploading the PDF, the app provides a summarized version of the document, displaying the result below the uploaded text.

#### **Technologies Used:**

- **Flask**: Lightweight web framework for serving the application.
- **Transformers (Hugging Face)**: For text summarization using pre-trained NLP models.
- **PyTorch**: Deep learning framework used by the Hugging Face models.
- **PyPDF2**: A Python library used to extract text from uploaded PDFs.
- **HTML/CSS**: For creating the front-end interface.

#### **Installation:**

1. Install the dependencies:

```bash
Flask==2.3.2
transformers==4.35.1
torch==2.1.0
PyPDF2==3.0.0
```

2. Run the Flask app:

```bash
python app.py
```

3. Open a browser and go to `http://127.0.0.1:5000` to use the app.

#### **How It Works:**

1. The user uploads a PDF file using the web interface.
2. The app extracts the text content from the PDF.
3. The extracted text is passed through a pre-trained summarization model (e.g., BART or T5).
4. The summarized text is displayed to the user.

#### **Usage:**

- Upload any PDF document.
- The app will extract the text and generate a summary based on the content.
- Summarized text is shown below the uploaded PDF content.

#### **Note:**

- The app works best for smaller PDFs. Large PDFs may require more time for processing due to the text extraction and summarization process.