Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


https://github.com/akhand-pratap-tiwari/automatic-extractive-text-summarization-using-tf-idf

Text Summarization using TF-IDF technique in Python.

natural-language-processing nltk python python-3 python3 sklearn tfidf tfidf-text-analysis vectorization



Host: GitHub
URL: https://github.com/akhand-pratap-tiwari/automatic-extractive-text-summarization-using-tf-idf
Owner: Akhand-Pratap-Tiwari
Created: 2022-11-17T15:14:31.000Z (about 2 years ago)
Default Branch: master
Last Pushed: 2024-03-23T17:54:06.000Z (10 months ago)
Last Synced: 2024-03-23T18:42:42.316Z (10 months ago)
Topics: natural-language-processing, nltk, python, python-3, python3, sklearn, tfidf, tfidf-text-analysis, vectorization
Language: Jupyter Notebook
Homepage:
Size: 7.81 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


README

Automatic extractive text summarization builds a summary of a text document algorithmically by selecting sentences from the document itself rather than writing new ones. A common way to score those sentences, and the one used here, is TF-IDF.

TF-IDF (term frequency-inverse document frequency) is a statistical measure of how important a word is to a document within a collection of documents. A word's score rises with how often it appears in the document and falls with how many of the other documents also contain it.
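As a minimal sketch of that definition, the snippet below computes a TF-IDF score by hand using only the standard library. The helper names (`tf`, `idf`, `tf_idf`) and the toy corpus are illustrative assumptions, not code from this repository, and the plain `log(N / df)` form used here differs from the smoothed variants that libraries such as scikit-learn apply by default.

```python
import math

def tf(word, doc_tokens):
    """Term frequency: share of the document's tokens equal to `word`."""
    return doc_tokens.count(word) / len(doc_tokens)

def idf(word, corpus):
    """Inverse document frequency: log(N / number of documents containing `word`)."""
    containing = sum(1 for doc in corpus if word in doc)
    return math.log(len(corpus) / containing) if containing else 0.0

def tf_idf(word, doc_tokens, corpus):
    """TF-IDF score of `word` in one document of `corpus`."""
    return tf(word, doc_tokens) * idf(word, corpus)

# Hypothetical toy corpus: each document is a list of tokens.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

score_the = tf_idf("the", corpus[0], corpus)  # "the" occurs in every document
score_mat = tf_idf("mat", corpus[0], corpus)  # "mat" occurs only in this document
```

A word that appears in every document gets an IDF of `log(1) = 0`, so its score vanishes however frequent it is, while a word unique to one document keeps a positive score.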

TF-IDF scores are used to represent a document as a vector, with one element per word. The length of the vector is the number of unique words in the vocabulary, and the value of each element is the TF-IDF score of the corresponding word in that document (zero if the word does not occur in it).
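The vector representation can be sketched as follows. This is an illustrative standard-library version under the assumption of a plain `log(N / df)` weighting; the function name `tfidf_vectors` and the sample documents are hypothetical, and a library vectorizer would normally handle tokenization, smoothing, and normalization differently.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, vectors): one TF-IDF vector per tokenized document.

    Assumes every document contains at least one token.
    """
    vocabulary = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)  # missing words count as 0
        vectors.append([
            (counts[word] / len(doc)) * math.log(n_docs / df[word])
            for word in vocabulary
        ])
    return vocabulary, vectors

docs = ["the cat sat".split(), "the dog barked".split()]
vocabulary, vectors = tfidf_vectors(docs)
```

Each vector has one slot per vocabulary word, so all documents share the same dimensions and can be compared element by element.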

To summarize a document, the TF-IDF scores are used to rank its sentences: a sentence is important if it contains important words, that is, words with high TF-IDF scores. The summary is created by selecting the highest-ranked sentences and concatenating them.
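One way this selection step could look is sketched below. It is not the repository's implementation: it assumes each sentence is treated as its own "document", splits sentences with a naive regular expression instead of a proper tokenizer, scores a sentence by the sum of its words' TF-IDF values, and keeps the chosen sentences in their original order. The function name `summarize` and its `n_sentences` parameter are hypothetical.

```python
import math
import re
from collections import Counter

def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    # Naive split after ., ! or ? followed by whitespace (an assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]

    n = len(sentences)
    # In how many sentences each word occurs.
    df = Counter(word for tokens in tokenized for word in set(tokens))

    def score(tokens):
        if not tokens:
            return 0.0
        counts = Counter(tokens)
        # Sum of TF-IDF over the sentence's distinct words.
        return sum((c / len(tokens)) * math.log(n / df[w]) for w, c in counts.items())

    ranked = sorted(range(n), key=lambda i: score(tokenized[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)

text = (
    "TF-IDF scores words. It scores words by frequency. "
    "Rare words get higher weights than common words."
)
summary = summarize(text, n_sentences=1)
```

Keeping the selected sentences in document order is a design choice that tends to make the result easier to read; ordering by score instead would only require dropping the final `sorted` call.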

There is only a single Python file because the technique is that simple to implement.