In-domain word embedding using Quora dataset.
https://github.com/jonad/quora_w2v

# Project: Word embeddings on the Quora dataset
## Project Overview
In this project, I apply a word embedding technique called [word2vec](https://en.wikipedia.org/wiki/Word2vec) to the [Quora Question Pairs dataset](https://www.kaggle.com/c/quora-question-pairs/data) to transform its text into numeric vectors suitable for use in a machine learning task.
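
As a rough illustration of the idea (not the notebook's actual code), the sketch below turns raw question strings into token lists and passes them to gensim's `Word2Vec`. The helper names `tokenize`, `build_corpus`, and `train_word2vec`, the regular expression, and all hyperparameters are assumptions made for this example. The training call assumes gensim 4.x, where the embedding size is set with `vector_size` (it was `size` in gensim 3.x).

```python
import re


def tokenize(question):
    """Lowercase a question and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", question.lower())


def build_corpus(questions):
    """Convert raw questions into token lists, skipping missing or empty rows."""
    corpus = []
    for question in questions:
        if isinstance(question, str):  # missing values (None/NaN) are not strings
            tokens = tokenize(question)
            if tokens:
                corpus.append(tokens)
    return corpus


def train_word2vec(corpus, dim=100):
    """Train a skip-gram word2vec model (hypothetical settings; needs gensim 4.x)."""
    from gensim.models import Word2Vec  # imported here so the helpers above work without gensim

    return Word2Vec(
        sentences=corpus,
        vector_size=dim,  # length of each word vector
        window=5,         # context words considered on each side
        min_count=1,      # keep every word in this toy corpus
        sg=1,             # 1 = skip-gram, 0 = CBOW
        workers=1,
        seed=0,
    )


# Toy questions standing in for the question1/question2 columns of train.csv
sample = ["What is machine learning?", "How do I learn machine learning?", None]
corpus = build_corpus(sample)
print(corpus)
```

With gensim installed, `model = train_word2vec(corpus)` returns a model whose `model.wv["machine"]` is the numeric vector for that word. On the real dataset, a higher `min_count` would usually be chosen to drop rare words.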

## Software and Libraries
This project requires **Python 3.6** and the following Python libraries installed:
- [NumPy](http://www.numpy.org/)
- [Pandas](http://pandas.pydata.org)
- [matplotlib](http://matplotlib.org/)
- [gensim](https://radimrehurek.com/gensim/index.html)

## Run
In a terminal or command window, navigate to the top-level project directory `quora_w2v/` (the one that contains this README) and run one of the following commands:

```bash
ipython notebook quoraw2v.ipynb
```
or
```bash
jupyter notebook cnn_quora.ipynb
```

Either command starts the Jupyter Notebook server and opens the project notebook in your browser.