https://github.com/jonad/quora_w2v
In-domain word embedding using Quora dataset.
- Host: GitHub
- URL: https://github.com/jonad/quora_w2v
- Owner: jonad
- Created: 2017-08-29T18:17:34.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2017-09-05T01:03:58.000Z (about 8 years ago)
- Last Synced: 2024-12-27T12:09:39.707Z (10 months ago)
- Topics: gensim-word2vec, matplotlib, numpy, pandas, python3
- Language: Jupyter Notebook
- Homepage:
- Size: 11.7 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Project: Word embeddings on the Quora dataset
## Project Overview
In this project, I apply a word embedding technique called [word2vec](https://en.wikipedia.org/wiki/Word2vec) to the [Quora Question Pairs dataset](https://www.kaggle.com/c/quora-question-pairs/data) to transform text data into numeric vectors suitable for use in a machine learning task.

## Software and Libraries
This project requires **Python 3.6** and the following Python libraries installed:
- [NumPy](http://www.numpy.org/)
- [Pandas](http://pandas.pydata.org)
- [matplotlib](http://matplotlib.org/)
- [gensim](https://radimrehurek.com/gensim/index.html)

## Run
In a terminal or command window, navigate to the top-level project directory `quora_w2v/` (the directory that contains this README) and run one of the following commands:

```bash
ipython notebook quoraw2v.ipynb
```
or
```bash
jupyter notebook quoraw2v.ipynb
```

This will open the Jupyter Notebook software and project file in your browser.
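
For orientation, the word2vec step described in the Project Overview could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The helper names (`tokenize`, `build_corpus`, `train_word2vec`), the regular-expression tokenizer, the sample questions, and the hyperparameters are all assumptions made for the example. The `vector_size` argument assumes gensim 4.x; older gensim releases call the same parameter `size`.

```python
import re


def tokenize(question):
    """Lowercase a question and split it into alphanumeric tokens.

    Hypothetical helper: the notebook may preprocess text differently.
    """
    return re.findall(r"[a-z0-9]+", str(question).lower())


def build_corpus(questions):
    """Turn an iterable of question strings into a list of token lists,
    skipping questions that produce no tokens."""
    corpus = []
    for question in questions:
        tokens = tokenize(question)
        if tokens:
            corpus.append(tokens)
    return corpus


def train_word2vec(corpus, dim=100):
    """Train a word2vec model on a tokenized corpus.

    gensim is imported here so the helpers above work without it.
    Assumes gensim 4.x (`vector_size`); hyperparameters are illustrative.
    """
    from gensim.models import Word2Vec

    return Word2Vec(
        sentences=corpus,
        vector_size=dim,  # dimensionality of each word vector
        window=5,         # context words considered on each side
        min_count=1,      # keep every word; raise this on the full dataset
        workers=1,
    )


# Made-up sample questions standing in for the dataset's question text.
sample_questions = [
    "How do I learn Python?",
    "What is the best way to learn Python?",
    "How can I learn machine learning?",
]
corpus = build_corpus(sample_questions)
print(corpus[0])
```

With gensim installed, `model = train_word2vec(corpus)` trains the embedding and `model.wv["python"]` returns the vector learned for that word. On the real data, the question text would come from the `question1` and `question2` columns of the Kaggle `train.csv` file (for example via `pandas.read_csv`), with missing values dropped before tokenizing.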