https://github.com/rajspeaks/deep-learning-approach-to-english-corpus-text-visualization-using-word2vec-model

English Corpus Text-Visualization using Word2Vec Model from Gensim. A mini project under the mentorship of Prof. Sandipan Ganguly, HIT-K.

# English Corpus Text Visualization using Word2Vec Model

A machine learning approach to visualizing an English corpus using the Word2Vec model from the Gensim library.
This project was done to test the accuracy of the Word2Vec model on an English corpus.

## Library requirements:

1. Sklearn (scikit-learn): Used for data preprocessing, model selection, classification, regression, and clustering.
2. Matplotlib: Used for 2D and 3D plotting, such as histograms and bar charts.
3. Gensim: Open-source library used for text analysis, including Word2Vec and Doc2Vec.
4. The Melon Honey font is used, and the sample texts were collected from the Internet.

## Word2Vec

The Word2Vec model is used to learn word embeddings. Here I used the Gensim library and Matplotlib's pyplot for 2D visualization of the corpus.

## Methodology:

1. First, I took an English corpus and removed its punctuation.
2. Split the data and visualized the corpus.
3. Repeated the process with a larger corpus.
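The preprocessing steps above can be sketched with the Python standard library alone. This is an illustration under assumptions, not the project's code: the helper names `remove_punctuation` and `tokenize` are hypothetical, "splitting" is taken to mean splitting each line into words, and lowercasing is an extra step the README does not mention.

```python
import string


def remove_punctuation(text):
    """Delete every ASCII punctuation character from the text."""
    return text.translate(str.maketrans("", "", string.punctuation))


def tokenize(text):
    """Strip punctuation, lowercase (an assumption), and split each line into words."""
    sentences = []
    for line in text.splitlines():
        words = remove_punctuation(line).lower().split()
        if words:  # skip lines that are empty after cleaning
            sentences.append(words)
    return sentences


# Hypothetical sample text standing in for the real corpus.
sample = "Hello, world! This is a test.\nWord2Vec learns from context."
tokens = tokenize(sample)
print(tokens)
```

The result is a list of word lists, one per line of input, which is the shape Gensim's `Word2Vec` accepts as its `sentences` argument.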

## Tools:
1. Google Colab/Jupyter Notebook
2. Language: Python
3. Word2Vec from Gensim
4. Matplotlib | Pyplot

### Mentor
Prof. Sandipan Ganguly, HIT-K.

### Developer
Rajdeep Das

### Thank you