https://github.com/jarif87/article-summarizer
Efficient Text Summarization: Generating Concise Highlights
blurr deep-learning fastai hugginface python textsummarization transformer
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/jarif87/article-summarizer
- Owner: jarif87
- License: mit
- Created: 2024-02-28T11:38:37.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-31T13:22:36.000Z (about 1 year ago)
- Last Synced: 2025-07-09T11:04:05.831Z (3 months ago)
- Topics: blurr, deep-learning, fastai, hugginface, python, textsummarization, transformer
- Language: Jupyter Notebook
- Homepage: https://text-summarizer-z6vl.onrender.com
- Size: 5.09 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Text-Summarizer
***This project covers data scraping and text summarization with Hugging Face Transformers and Blurr. Three models were trained for the task, and the result is deployed on Hugging Face Spaces for easy sharing and access. It shows how modern NLP tooling can produce succinct summaries of long articles.***
***Article Wordcloud***
***Highlights Wordcloud***
***Most Common Words in Highlights***
***Most Common Words in Article***
***Highlights Text Length***
***Article Text Length***
# Data Collection
I collected 8,176 articles and their highlights from the Daily Mail [website](https://www.dailymail.co.uk/home/index.html). The goal is to train a text summarization model that generates a brief summary of a given article, using Hugging Face Transformers together with Blurr and fastai. The resulting summaries should be concise and informative, so that readers can grasp an article's key points quickly.
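The word-frequency and text-length plots listed earlier can be reproduced in spirit with a few lines of standard-library Python. This is a minimal sketch, not the notebook code: the record layout (`article` and `highlights` keys), the stop-word list, and the two sample records are all invented for illustration.

```python
import re
from collections import Counter
from statistics import mean

# Tiny hand-picked stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was", "for", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def most_common_words(texts, n=10):
    """Return the n most frequent non-stop-words across all texts."""
    counts = Counter(
        word for text in texts for word in tokenize(text) if word not in STOP_WORDS
    )
    return counts.most_common(n)


def length_stats(texts):
    """Summarize text lengths, measured in word tokens."""
    lengths = [len(tokenize(text)) for text in texts]
    return {"min": min(lengths), "mean": mean(lengths), "max": max(lengths)}


# Invented records standing in for scraped (article, highlights) pairs.
records = [
    {
        "article": "The council approved the new bridge. The bridge opens in May.",
        "highlights": "Council approves new bridge.",
    },
    {
        "article": "Heavy rain flooded the bridge road on Monday.",
        "highlights": "Rain floods bridge road.",
    },
]
articles = [r["article"] for r in records]
highlights = [r["highlights"] for r in records]

print(most_common_words(articles, n=3))
print(length_stats(articles))
print(length_stats(highlights))
```

Comparing `length_stats(articles)` with `length_stats(highlights)` gives a rough sense of the compression ratio the model has to learn.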
# Model Training
***I trained three models for text summarization: distilbart-cnn-6-6, distilbart-cnn-12-6, and facebook/bart-large-cnn. Of these, I deployed distilbart-cnn-12-6, a transformer model available from Hugging Face. Training used the Blurr library together with fastai. The notebooks covering training and deployment for all three models are available [here](https://github.com/jarif87/Text-Summarizer/tree/main/notebooks). Feel free to explore them to see the methods used.***
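For readers who want to try a model of this family without the training notebooks, here is a minimal inference sketch using the Hugging Face `pipeline` API. It is an assumption-laden stand-in, not the project's code: it loads the public `sshleifer/distilbart-cnn-12-6` checkpoint rather than the fine-tuned weights, it bypasses Blurr and fastai entirely, and the helper names and length limits are illustrative.

```python
# Public base checkpoint (assumed); NOT the weights fine-tuned in this project.
MODEL_NAME = "sshleifer/distilbart-cnn-12-6"


def build_summarizer(model_name=MODEL_NAME):
    """Create a Hugging Face summarization pipeline (downloads the model on first use)."""
    # Imported here so the rest of the module works without transformers installed.
    from transformers import pipeline

    return pipeline("summarization", model=model_name)


def summarize(text, summarizer, max_length=130, min_length=30):
    """Run a summarization pipeline (or any compatible callable) on one text."""
    result = summarizer(
        text,
        max_length=max_length,  # upper bound on summary length, in tokens
        min_length=min_length,  # lower bound on summary length, in tokens
        do_sample=False,        # deterministic decoding
        truncation=True,        # clip inputs longer than the model's limit
    )
    return result[0]["summary_text"].strip()
```

Typical usage would be `summarizer = build_summarizer()` followed by `summarize(article_text, summarizer)`. Passing the pipeline in as an argument keeps model loading (slow) separate from the per-request call.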
# Model Performance
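The tables in this section report accuracy, precision, recall, and F1 for each model; the README does not state how these scores were computed. Assuming F1 is the usual harmonic mean of precision and recall, the short check below confirms that the reported F1 values are consistent with the reported precision and recall after rounding to two decimals.

```python
# Precision, recall, and F1 as reported in the tables of this section.
reported = {
    "distilbart-cnn-6-6": {"precision": 0.90, "recall": 0.97, "f1": 0.93},
    "distilbart-cnn-12-6": {"precision": 0.89, "recall": 0.97, "f1": 0.93},
    "facebook/bart-large-cnn": {"precision": 0.89, "recall": 0.97, "f1": 0.93},
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 from precision and recall, rounded like the tables.
recomputed = {
    name: round(f1_score(m["precision"], m["recall"]), 2)
    for name, m in reported.items()
}

for name, value in recomputed.items():
    print(f"{name}: reported f1={reported[name]['f1']}, recomputed f1={value}")
```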
***distilbart-cnn-6-6***

|accuracy|precision|recall|f1|
|---|---|---|---|
|0.91|0.90|0.97|0.93|

***distilbart-cnn-12-6***

|accuracy|precision|recall|f1|
|---|---|---|---|
|0.90|0.89|0.97|0.93|

***facebook/bart-large-cnn***

|accuracy|precision|recall|f1|
|---|---|---|---|
|0.90|0.89|0.97|0.93|

# Model Deployment
The text summarization model is deployed as a Gradio app on Hugging Face Spaces. You can find the implementation in the deployment folder, or open the app directly through this [link](https://huggingface.co/spaces/jarif/Summarization).
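A Gradio front end for a summarizer can be quite small. The sketch below is not the code in the deployment folder; it assumes a `summarize_fn` callable that maps an article string to a summary string, and the helper names, labels, and minimum-length check are hypothetical.

```python
def make_handler(summarize_fn, min_words=30):
    """Wrap a summarizer so that very short or empty inputs get a hint instead."""

    def handler(article):
        article = (article or "").strip()
        if len(article.split()) < min_words:
            return f"Please paste at least {min_words} words."
        return summarize_fn(article)

    return handler


def build_demo(summarize_fn):
    """Build a Gradio Interface around the handler (requires gradio)."""
    # Imported here so make_handler can be used and tested without gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=make_handler(summarize_fn),
        inputs=gr.Textbox(lines=12, label="Article"),
        outputs=gr.Textbox(label="Summary"),
        title="Text Summarizer",
    )
```

Calling `build_demo(my_summarize_fn).launch()` starts the app locally; on Hugging Face Spaces the same script typically lives in `app.py`.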
# Web Deployment
***I also built a Flask application for text summarization: it takes input text and returns a condensed summary. The implementation lives in the Flask branch. To try the application live, visit the website through this [link](https://text-summarizer-z6vl.onrender.com/).***
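As an illustration of how such a Flask service might be wired up, here is a minimal sketch of a JSON endpoint. It does not reflect the code in the Flask branch: the `/summarize` route, the `text` and `summary` field names, and the helper names are all assumptions made for this example.

```python
def validate_payload(payload):
    """Return (text, error) for a decoded JSON request body."""
    if not isinstance(payload, dict):
        return None, "expected a JSON object"
    text = payload.get("text", "")
    if not isinstance(text, str) or not text.strip():
        return None, "field 'text' must be a non-empty string"
    return text.strip(), None


def create_app(summarize_fn):
    """Application factory: build a Flask app around any summarizer callable."""
    # Imported here so validate_payload can be tested without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/summarize", methods=["POST"])  # hypothetical route name
    def summarize_route():
        # silent=True yields None instead of raising on a missing or invalid JSON body.
        text, error = validate_payload(request.get_json(silent=True))
        if error:
            return jsonify({"error": error}), 400
        return jsonify({"summary": summarize_fn(text)})

    return app
```

With a real summarizer passed in, `create_app(my_summarize_fn).run()` serves the endpoint locally. Keeping validation in a plain function makes it easy to test separately from the web framework.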