{"id":24576607,"url":"https://github.com/teja-1403/textsummarization-using-pegasus-bart","last_synced_at":"2026-05-15T20:32:20.370Z","repository":{"id":273848331,"uuid":"921063606","full_name":"teja-1403/TextSummarization-Using-PEGASUS-BART","owner":"teja-1403","description":"A text summarization project leveraging the PEGASUS and BART models for news article summarization. The project compares their performance based on ROUGE and Average Precision metrics. PEGASUS was fine-tuned for this task, while BART was evaluated without fine-tuning.","archived":false,"fork":false,"pushed_at":"2025-01-23T10:46:48.000Z","size":106,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-09T01:04:34.679Z","etag":null,"topics":["bert","natural-language-processing","pegasus","python","textsummarization"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/teja-1403.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-23T09:14:53.000Z","updated_at":"2025-02-24T17:31:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"027de669-7aff-4694-85d9-e0aa883b351b","html_url":"https://github.com/teja-1403/TextSummarization-Using-PEGASUS-BART","commit_stats":null,"previous_names":["teja-1403/textsummarization-using-pegasus-bart"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/teja-1403/TextSummarization-Using-PEGASUS-BART","repository_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/teja-1403%2FTextSummarization-Using-PEGASUS-BART","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/teja-1403%2FTextSummarization-Using-PEGASUS-BART/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/teja-1403%2FTextSummarization-Using-PEGASUS-BART/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/teja-1403%2FTextSummarization-Using-PEGASUS-BART/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/teja-1403","download_url":"https://codeload.github.com/teja-1403/TextSummarization-Using-PEGASUS-BART/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/teja-1403%2FTextSummarization-Using-PEGASUS-BART/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33078899,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-15T20:25:35.270Z","status":"ssl_error","status_checked_at":"2026-05-15T20:25:34.732Z","response_time":103,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","natural-language-processing","pegasus","python","textsummarization"],"created_at":"2025-01-23T22:49:56.097Z","updated_at":"2026-05-15T20:32:20.365Z","avatar_url":"https://github.com/teja-1403.png","language":"Jupyter 
Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# News Article Summarization Using PEGASUS and BART Models\n\nThis project explores and compares the capabilities of PEGASUS and BART models in summarizing news articles. While PEGASUS was fine-tuned for this specific task, BART was used without fine-tuning. The evaluation is based on ROUGE and Average Precision metrics to assess the quality and relevance of the generated summaries.\n\n---\n\n## **Introduction**\n\nText summarization is a critical task in natural language processing (NLP) that involves generating concise and coherent summaries from lengthy text. This project focuses on summarizing news articles using two state-of-the-art models:\n- **PEGASUS**: Specifically designed for abstractive summarization tasks.\n- **BART**: A versatile model capable of handling both generative and discriminative tasks.\n\nThe aim is to compare their performance and provide insights into their summarization abilities under different conditions.\n\n---\n\n## **Dataset Details**\n\n### **Dataset Description**\nThe dataset comprises **112 rows** of news articles, each containing the following fields:\n- **Sr. No**: Unique identifier for each record.\n- **Newspaper Name**: Source of the news article.\n- **Published Date**: The date the article was published.\n- **URL**: Link to the original article.\n- **Headline**: Title of the article.\n- **Content**: Full content of the news article.\n- **Human Summary**: Manually created summary of the article (used as a reference for evaluation).\n- **Category**: Domain or topic of the article (e.g., Science and Technology, National News, Business, Environment, Health).\n\n### **Dataset Link**\nYou can download the dataset [here](https://drive.google.com/file/d/1k3gjgRneahBi6umnVdErRWYv_-HMjIT7/view?usp=sharing) \n\n---\n\n## **Methodology**\n\n1. **Preprocessing**:  \n   - Cleaned and tokenized the text data.  
\n   - Prepared the dataset for input to PEGASUS and BART models.  \n\n2. **Model Training**:  \n   - **PEGASUS**: Fine-tuned on the dataset to enhance summarization accuracy.  \n   - **BART**: Used the pre-trained version without fine-tuning.  \n\n3. **Evaluation Metrics**:  \n   - **ROUGE (Recall-Oriented Understudy for Gisting Evaluation)**:  \n     - ROUGE-1: Unigram overlap between generated and reference summaries.  \n     - ROUGE-2: Bigram overlap between generated and reference summaries.  \n     - ROUGE-L: Longest common subsequence overlap.  \n   - **Average Precision**: Measures the relevance of generated summaries.\n\n4. **Model Comparison**:  \n   - Compared PEGASUS and BART based on the aforementioned metrics.\n\n---\n\n## **Results**\n\n### **Evaluation Metrics**\n\n**PEGASUS (Fine-Tuned):**  \n- ROUGE-1: **0.4103**  \n- ROUGE-2: **0.2144**  \n- ROUGE-L: **0.3142**  \n- Average Precision: **0.6169**\n\n**BART (Pre-Trained):**  \n- ROUGE-1: **0.4258**  \n- ROUGE-2: **0.2063**  \n- ROUGE-L: **0.3060**  \n- Average Precision: **0.5170**\n\n---\n\n## **Conclusion**\n\n- **BART**: Demonstrated robust summarization capabilities without any fine-tuning, achieving the higher ROUGE-1 score (0.4258 vs. 0.4103), which highlights the strength of its pre-trained architecture.  \n- **PEGASUS**: The fine-tuned model scored higher on ROUGE-2 (0.2144 vs. 0.2063), ROUGE-L (0.3142 vs. 0.3060), and Average Precision (0.6169 vs. 0.5170), suggesting it captures relevant content better. However, the ROUGE differences are small, so overall performance was comparable to BART, which points to fine-tuning on larger, more diverse datasets as the likely route to further improvement.\n\n---\n\n## **Future Scope**\n\n1. **Larger Dataset**: Extend the dataset with more diverse articles to enhance model generalization.  \n2. **Additional Models**: Compare other state-of-the-art models like T5, GPT, and BERTSUM.  \n3. **Hyperparameter Optimization**: Tune the learning rate and batch size for further performance improvement.  
\n4. **Cross-Domain Summarization**: Apply these models to other domains, such as healthcare and research.  \n5. **Real-Time Summarization**: Optimize models for faster inference to support real-time applications.  \n\n---\n\n## **Languages used** \n\n![python-logo-only](https://github.com/user-attachments/assets/a78aa447-fe92-4892-aaed-4dd6ea761795)\n\n# \n📣 Feel free to have a look at all the files in this repository!🤗\n\n❎ In case you find issues in any of my Repositories, you can Hit Me Up [here](https://github.com/issues)! 👈\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fteja-1403%2Ftextsummarization-using-pegasus-bart","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fteja-1403%2Ftextsummarization-using-pegasus-bart","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fteja-1403%2Ftextsummarization-using-pegasus-bart/lists"}