Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/soubhatta/suvidha-foundation_ml-internship
This project aims to improve text summarization for text containing rare and infrequently used words. It is built on the CNN/DailyMail dataset using Python libraries and machine learning algorithms.
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/soubhatta/suvidha-foundation_ml-internship
- Owner: soubhatta
- Created: 2023-11-16T21:52:04.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-16T22:11:30.000Z (about 1 year ago)
- Last Synced: 2023-11-17T23:08:02.246Z (about 1 year ago)
- Language: Python
- Size: 2.02 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Suvidha-Foundation_ML-Internship
This project aims to improve text summarization for text containing rare and infrequently used words. It is built on the CNN/DailyMail dataset using Python libraries and machine learning algorithms.

## TOPIC NAME - Summarization of Text consisting of Rare Words
As the volume of online content grows, automatic text summarization has become indispensable. Handling rare and seldom-used words remains a significant challenge for summarization models. Extractive, abstractive, and hybrid summarization methods address it with transformer models and attention mechanisms. The results were evaluated using the ROUGE metric.

## DATASET FOR EVALUATION
The CNN / DailyMail dataset is a collection of English-language news articles, containing over 300,000 unique articles written by journalists at CNN and the Daily Mail. It was originally created for machine reading comprehension and abstractive question answering, and it is now used for both extractive and abstractive summarization.

[CNN, Daily Mail Dataset](https://www.kaggle.com/datasets/gowrishankarp/newspaper-text-summarization-cnn-dailymail/data)
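To make the idea of rare words concrete, the following is a minimal sketch of how such words could be identified in a collection of articles by raw frequency. It is not the project's actual method: the function names, the `max_count` threshold, and the two sample sentences standing in for dataset articles are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rare_words(documents, max_count=1):
    """Return the words that occur at most max_count times across all documents.

    max_count is a hypothetical threshold chosen for illustration.
    """
    counts = Counter(tok for doc in documents for tok in tokenize(doc))
    return {word for word, n in counts.items() if n <= max_count}


def rare_word_ratio(text, rare):
    """Fraction of tokens in text that belong to the rare-word set."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(tok in rare for tok in tokens) / len(tokens)


# Two made-up sentences standing in for news articles from the dataset.
articles = [
    "The minister gave a speech about the economy.",
    "The minister discussed the economy and a rare pangolin sighting.",
]
rare = rare_words(articles)
print(sorted(rare))
print(rare_word_ratio(articles[1], rare))
```

On a real corpus the threshold would normally be set relative to corpus size, and a summarizer could use the resulting ratio to flag articles where rare vocabulary is likely to be dropped.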
## FINAL RESULTS OF PROJECT
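ROUGE compares a generated summary with a reference summary by counting overlapping n-grams. The sketch below shows a simplified ROUGE-N computation under stated assumptions: it uses plain whitespace tokenization with no stemming, and the function name `rouge_n` and the example sentences are illustrative rather than taken from the project. Published scores are usually produced with an established ROUGE package instead.

```python
from collections import Counter


def rouge_n(candidate, reference, n=1):
    """Compute simplified ROUGE-N precision, recall, and F1.

    Tokens are lowercased and split on whitespace (an assumption;
    standard implementations also apply stemming and other normalization).
    """
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_n("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

Recall is the share of reference n-grams recovered by the generated summary, which is why a rare word missing from the output lowers the score directly.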
ROUGE scores are used to measure the model's performance.

## JOURNAL REFERENCE
https://www.sciencedirect.com/science/article/pii/S2949719123000110