https://github.com/0xkibh/simple-nlp
A simple NLP program that clusters text, using TF-IDF or Word2Vec for feature extraction and K-Means as the clustering algorithm
gensim kmeans-clustering nlp pandas python tfidf word2vec
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/0xkibh/simple-nlp
- Owner: 0xkibh
- Created: 2022-11-09T16:04:02.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2022-11-10T15:50:52.000Z (about 3 years ago)
- Last Synced: 2025-03-28T18:51:16.990Z (9 months ago)
- Topics: gensim, kmeans-clustering, nlp, pandas, python, tfidf, word2vec
- Language: Jupyter Notebook
- Homepage:
- Size: 31.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
README
# Simple NLP Application
## Simple TF-IDF and K-Means Clustering
- This program clusters text based on the words it contains.
- Text preprocessing is done with basic Python code.
- Feature extraction uses the TF-IDF algorithm.
- Clustering is done by K-Means.
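
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn for both TF-IDF and K-Means (the README does not name the library), and the function names `preprocess` and `cluster_tfidf`, the sample documents, and the parameter values are all hypothetical.

```python
import re

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def preprocess(text):
    """Lowercase the text and keep only alphabetic tokens (basic Python only)."""
    return " ".join(re.findall(r"[a-z]+", text.lower()))


def cluster_tfidf(texts, n_clusters=2, seed=0):
    """Return one cluster label per input text."""
    cleaned = [preprocess(t) for t in texts]
    # Each row of the sparse matrix is the TF-IDF vector of one document.
    features = TfidfVectorizer().fit_transform(cleaned)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(features).tolist()


# Hypothetical sample data, not taken from the repository.
docs = [
    "The cat sat on the mat.",
    "A cat and a kitten sat together.",
    "Stocks fell as markets closed.",
    "Markets rallied and stocks rose.",
]
labels = cluster_tfidf(docs)
print(labels)  # one integer label (0 or 1) per document
```

The label numbers themselves are arbitrary; only which documents share a label is meaningful.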
## Simple Word2Vec and K-Means Clustering
- This code clusters text in the same way as the program above.
- Here Word2Vec is used in place of TF-IDF to turn the text into feature vectors.
- Given the limitations of TF-IDF (or BoW), such as ignoring word order and the similarity between related words, Word2Vec seems the better option.
- Clustering is done with K-Means, the same as above.