https://github.com/soubhatta/suvidha-foundation_ml-internship

A text summarization model focused on rare and infrequently used words, built on the CNN/DailyMail dataset using Python libraries and machine learning algorithms.

Host: GitHub
URL: https://github.com/soubhatta/suvidha-foundation_ml-internship
Owner: soubhatta
Created: 2023-11-16T21:52:04.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2023-11-16T22:11:30.000Z (about 1 year ago)
Last Synced: 2023-11-17T23:08:02.246Z (about 1 year ago)
Language: Python
Size: 2.02 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

## Suvidha-Foundation_ML-Internship
This project builds a text summarization model focused on rare and infrequently used words. It is built on the CNN/DailyMail dataset using Python libraries and machine learning algorithms.

## TOPIC NAME - Summarization of Text consisting of Rare Words
With online content growing rapidly, automatic text summarization has become an essential tool. Handling rare and infrequently used words remains a significant challenge for summarization systems. Extractive, abstractive, and hybrid approaches to this problem draw on transformer models and attention mechanisms. The results were evaluated using the ROUGE metric.
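
To make "rare words" concrete, here is a minimal sketch of one way to flag them: count word frequencies across a corpus and keep the words that fall at or below a threshold. This is an illustration only, not necessarily how the project identifies rare words; the function name `rare_words`, the `max_count` threshold, and the sample sentences are all assumptions.

```python
import re
from collections import Counter


def rare_words(documents, max_count=1):
    """Return the set of words that occur at most max_count times in the corpus."""
    counts = Counter()
    for text in documents:
        # Simple lowercase tokenization; a real pipeline would likely use a proper tokenizer.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return {word for word, count in counts.items() if count <= max_count}


# Hypothetical sample corpus, for illustration only.
docs = [
    "The senator spoke about fiscal policy.",
    "The senator criticized the obfuscation in the policy.",
]
rare = rare_words(docs, max_count=1)
```

Words such as "obfuscation" appear once and are flagged, while "senator" and "policy" appear in both sentences and are not.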

## DATASET FOR EVALUATION
The CNN/DailyMail dataset is an English-language collection of over 300,000 unique news articles sourced from CNN and the Daily Mail. It was originally created for machine reading comprehension and abstractive question answering, and it is now used for both extractive and abstractive summarization.

[CNN, Daily Mail Dataset](https://www.kaggle.com/datasets/gowrishankarp/newspaper-text-summarization-cnn-dailymail/data)
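
As a rough sketch, article and summary pairs could be read from one of the dataset's CSV files with the standard library. The column names `article` and `highlights` are an assumption based on how this dataset is commonly distributed, and `load_pairs` is a hypothetical helper, not part of the project.

```python
import csv


def load_pairs(csv_path, limit=None):
    """Read (article, highlights) pairs from one CSV split of the dataset.

    Assumes the file has 'article' and 'highlights' columns.
    """
    pairs = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            pairs.append((row["article"], row["highlights"]))
            # Stop early when only a small sample is needed.
            if limit is not None and len(pairs) >= limit:
                break
    return pairs
```

A call such as `load_pairs("test.csv", limit=5)` would return the first five pairs, where `test.csv` stands in for wherever the downloaded split is stored.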

## FINAL RESULTS OF PROJECT
ROUGE scores are used to measure the performance of the model.
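
For readers unfamiliar with the metric, the sketch below computes ROUGE-N as clipped n-gram overlap between a candidate summary and a reference summary. It is a simplified illustration using whitespace tokenization, not the project's evaluation code; the function names and the example sentences are assumptions, and established ROUGE packages apply additional normalization.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter (multiset) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N precision, recall, and F1 from whitespace tokens."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical candidate and reference summaries.
scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
```

In this example five of the six unigrams match, so precision, recall, and F1 are all 5/6. A rare word that the model drops or replaces lowers recall in the same way "lay" versus "sat" does here.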

## JOURNAL REFERENCE
https://www.sciencedirect.com/science/article/pii/S2949719123000110