https://github.com/akhand-pratap-tiwari/automatic-extractive-text-summarization-using-tf-idf
Text Summarization using TF-IDF technique in Python.
natural-language-processing nltk python python-3 python3 sklearn tfidf tfidf-text-analysis vectorization
- Host: GitHub
- URL: https://github.com/akhand-pratap-tiwari/automatic-extractive-text-summarization-using-tf-idf
- Owner: Akhand-Pratap-Tiwari
- Created: 2022-11-17T15:14:31.000Z
- Default Branch: master
- Last Pushed: 2024-03-23T17:54:06.000Z
- Last Synced: 2024-03-23T18:42:42.316Z
- Topics: natural-language-processing, nltk, python, python-3, python3, sklearn, tfidf, tfidf-text-analysis, vectorization
- Language: Jupyter Notebook
- Homepage:
- Size: 7.81 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Automatic extractive text summarization builds a summary of a text document algorithmically by selecting existing sentences from it rather than generating new text. One common and simple way to choose those sentences is to score them with TF-IDF.
TF-IDF (term frequency-inverse document frequency) is a statistical measure of how important a word is to a document within a collection of documents. A word's score rises with how often it appears in the document and falls with how many other documents in the collection also contain it.
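As a minimal sketch of that definition, the snippet below computes the score by hand using only the standard library. The helper names (`tf`, `idf`, `tf_idf`), the toy corpus, and the unsmoothed `log(N / df)` form of IDF are assumptions made for illustration; libraries such as scikit-learn use smoothed variants, and this is not necessarily what the repository's code does.

```python
import math

def tf(term, doc_tokens):
    """Term frequency: share of the document's tokens that equal `term`."""
    return doc_tokens.count(term) / len(doc_tokens)

def idf(term, corpus):
    """Inverse document frequency: log(N / number of documents containing `term`)."""
    containing = sum(1 for doc in corpus if term in doc)
    # A term that appears in no document gets a score of 0 (illustrative choice).
    return math.log(len(corpus) / containing) if containing else 0.0

def tf_idf(term, doc_tokens, corpus):
    """TF-IDF of `term` in one document, relative to the whole corpus."""
    return tf(term, doc_tokens) * idf(term, corpus)

# Hypothetical toy corpus: each document is a list of lowercase tokens.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

for word in ("the", "cat", "mat"):
    print(word, round(tf_idf(word, corpus[0], corpus), 4))
```

In this toy corpus, "the" appears in every document and so scores zero, while "mat", which appears only in the first document, scores higher than "cat", which appears in two.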
TF-IDF scores are used to build a vector that represents the importance of each word in the document. The vector has one element per unique word in the vocabulary (all unique words across the documents being compared), and each element holds the TF-IDF score of the corresponding word, which is zero when the word does not occur in the document.
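The vector representation can be sketched as follows. The function name `tfidf_vectors`, the alphabetical ordering of the vocabulary, and the unsmoothed IDF are assumptions for illustration only; scikit-learn's `TfidfVectorizer` produces a comparable matrix with different smoothing and normalization.

```python
import math

def tfidf_vectors(docs):
    """Return (vocabulary, vectors): one TF-IDF vector per tokenized document."""
    # The vocabulary is every unique token across all documents, in sorted order.
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = {term: sum(1 for doc in docs if term in doc) for term in vocab}
    vectors = []
    for doc in docs:
        length = len(doc) or 1  # avoid division by zero for an empty document
        vectors.append([
            (doc.count(term) / length) * math.log(n_docs / df[term])
            for term in vocab
        ])
    return vocab, vectors

# Hypothetical toy input.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]
vocab, vectors = tfidf_vectors(docs)
print(vocab)
print([round(x, 3) for x in vectors[0]])
```

Every vector has the same length as the vocabulary, so documents can be compared element by element.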
To summarize a text document, the TF-IDF scores are used to rank its sentences: a sentence is considered more important when the words it contains have higher scores. The summary is formed by selecting the highest-scoring sentences and concatenating them.
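The sentence-selection step might look like the sketch below. It is an illustration under stated assumptions, not the repository's implementation: each sentence is treated as its own "document", sentences are split with a naive regular expression (a tokenizer such as NLTK's would be more robust), a sentence's score is the sum of the TF-IDF values of its words, and the names `split_sentences`, `tokenize`, and `summarize` are hypothetical.

```python
import math
import re

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency, treating each sentence as one document.
    df = {}
    for toks in tokenized:
        for term in set(toks):
            df[term] = df.get(term, 0) + 1

    # Sentence score = sum over its words of tf * idf, with tf = count / length.
    # Summing idf over every token and dividing by the length is equivalent.
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        scores.append(sum(math.log(n / df[t]) for t in toks) / len(toks))

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return " ".join(sentences[i] for i in sorted(top))

# Hypothetical sample text.
text = (
    "The cat sat. The cat slept. "
    "Quantum computers factor large integers quickly. The cat ate."
)
print(summarize(text, n_sentences=1))
```

Because the score is normalized by sentence length, long sentences are not favored merely for containing more words; dropping that normalization is another reasonable design choice.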
There is only a single Python file, because the technique is that simple to implement.