https://github.com/kire-github/word2vec-numpy
A Skipgram with Negative Sampling (SGNS) implementation in pure NumPy
- Host: GitHub
- URL: https://github.com/kire-github/word2vec-numpy
- Owner: kire-github
- License: mit
- Created: 2026-03-16T11:25:26.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-03-18T22:09:39.000Z (4 months ago)
- Last Synced: 2026-03-19T10:54:01.261Z (4 months ago)
- Language: Python
- Size: 12.7 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# word2vec-numpy
This project implements Word2Vec as a skip-gram model with negative sampling (SGNS) in pure NumPy, without relying on an ML framework.
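To illustrate what a single SGNS update involves, the sketch below performs one stochastic gradient step for one (center, context) pair and a handful of negative samples. It is a minimal sketch under assumed names (`sgns_step`, `W_in`, `W_out`, `lr` are all hypothetical) and is not necessarily how this repository organises its code.

```python
import numpy as np


def sigmoid(x):
    # Logistic function; adequate for the small scores in this sketch.
    return 1.0 / (1.0 + np.exp(-x))


def sgns_step(W_in, W_out, center, context, negatives, lr=0.025):
    """One SGD step on the SGNS loss for a single training pair.

    All names are hypothetical. W_in and W_out are (vocab, dim) arrays
    that are updated in place; the loss before the update is returned.
    """
    v = W_in[center].copy()                       # center vector, shape (dim,)
    targets = np.asarray([context, *negatives], dtype=np.intp)
    labels = np.zeros(len(targets))
    labels[0] = 1.0                               # 1 for true context, 0 for negatives
    U = W_out[targets]                            # (k + 1, dim); fancy indexing copies
    scores = sigmoid(U @ v)
    # loss = -log sigma(u_o . v) - sum_k log sigma(-u_k . v)
    loss = -np.log(scores[0]) - np.sum(np.log(1.0 - scores[1:]))
    err = scores - labels                         # d(loss)/d(u . v) for each target
    W_in[center] -= lr * (err @ U)                # gradient w.r.t. the center vector
    # subtract.at accumulates correctly if a word index appears more than once.
    np.subtract.at(W_out, targets, lr * np.outer(err, v))
    return loss
```

Keeping two separate matrices (input and output embeddings) follows the formulation in the references below; the input matrix is the one usually kept as the final word vectors.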
## Usage
Training parameters can be adjusted in [`config.py`](./config.py).
An example that trains the model on the [`text8`](https://mattmahoney.net/dc/textdata.html) dataset is provided in [`text8_example.py`](./text8_example.py). Before running it, download and unzip the dataset:
```
wget http://mattmahoney.net/dc/text8.zip
unzip text8.zip
```
Then run the example:
```
python ./text8_example.py
```
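For orientation, skip-gram training consumes (center, context) index pairs drawn from a sliding window over the token stream. The sketch below shows one way such pairs could be built; `skipgram_pairs` and the fixed `window` are assumptions made for illustration and may differ from what `text8_example.py` actually does.

```python
import numpy as np


def skipgram_pairs(token_ids, window=2):
    """Return an (n_pairs, 2) array of (center, context) index pairs.

    Hypothetical helper using a fixed symmetric window; word2vec
    implementations often shrink the window at random per position.
    """
    pairs = []
    for i, center in enumerate(token_ids):
        lo = max(0, i - window)
        hi = min(len(token_ids), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, token_ids[j]))
    return np.array(pairs, dtype=np.int64).reshape(-1, 2)


# text8 is plain lowercase text with words separated by spaces, so
# tokenising it amounts to open("text8").read().split().
# A tiny stand-in sentence is used here instead of the real file.
words = "the quick brown fox jumps over the lazy dog".split()
vocab = {w: i for i, w in enumerate(dict.fromkeys(words))}  # first-seen order
token_ids = [vocab[w] for w in words]
pairs = skipgram_pairs(token_ids, window=2)
```

Each row of `pairs` can then be combined with sampled negative word indices to form one training example.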
## References & Citations
- [**The Illustrated Word2Vec**](https://jalammar.github.io/illustrated-word2vec/)
- [**word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method**](https://arxiv.org/abs/1402.3722)
- [**word2vec Parameter Learning Explained**](https://arxiv.org/abs/1411.2738)