Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/abhaskumarsinha/2d-continuous-skip-gram-model
A word2vec embedding, the Continuous Skip-Gram Model, encodes each word as a vector based on how closely words are related or how often they are used together.
continuous-skip-gram-model encoding keras keras-neural-networks keras-tensorflow machine-learning machine-learning-algorithms skipgram tensorflow2 tutorial word2vec word2vec-model
Last synced: 7 days ago
- Host: GitHub
- URL: https://github.com/abhaskumarsinha/2d-continuous-skip-gram-model
- Owner: abhaskumarsinha
- License: mit
- Created: 2022-07-22T17:05:31.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-07-22T18:38:58.000Z (over 2 years ago)
- Last Synced: 2024-11-21T03:49:50.411Z (2 months ago)
- Topics: continuous-skip-gram-model, encoding, keras, keras-neural-networks, keras-tensorflow, machine-learning, machine-learning-algorithms, skipgram, tensorflow2, tutorial, word2vec, word2vec-model
- Language: Jupyter Notebook
- Homepage:
- Size: 116 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# 2D-Continuous Skip-Gram-Model
A word2vec embedding, the Continuous Skip-Gram Model, encodes each word as a vector based on how closely words are related or how often they are used together.

## Dataset
We use Isaac Asimov's *The Last Question* (Nov 1956) as the dataset for the word2vec embedding.
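As a rough illustration (not the notebook's exact code), the story text can be turned into skip-gram training pairs with Keras' built-in helper; the file name and window size below are assumptions:

```python
import tensorflow as tf

# Hypothetical path to a plain-text copy of the story.
with open("the_last_question.txt", encoding="utf-8") as f:
    text = f.read()

# Map each word in the story to an integer id.
tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts([text])
sequence = tokenizer.texts_to_sequences([text])[0]
vocab_size = len(tokenizer.word_index) + 1

# Positive (target, context) pairs within the chosen window,
# plus an equal number of random negative pairs (label 0).
pairs, labels = tf.keras.preprocessing.sequence.skipgrams(
    sequence, vocabulary_size=vocab_size, window_size=2, negative_samples=1.0
)
```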
## Abstract
There are multiple ways to convert words or lines of natural language into vectors that can be fed into downstream processing pipelines. Two popular approaches are the Continuous Skip-Gram Model and the Continuous Bag of Words (CBOW) model. CBOW is known to perform well on large datasets for tasks such as hashing, searching, and machine comprehension, while the Continuous Skip-Gram Model used here is known to perform well on small corpora such as short stories. The mechanism of the Skip-Gram Model is the exact inverse of CBOW: it predicts context words from a target word rather than the target word from its context.

This is a small notebook demonstrating the simplest case of the Skip-Gram Model, with a variable window size and training pairs consisting of one target word and one context word. Since each pair contains only two words, we use a *linear* or *sigmoid* activation in the final layer instead of the *softmax* used in the original model, because softmax requires **at least** two outputs to work. The result is a single scalar embedding per word, a special case of a 1D vector, and we can compare two words by the distance between their scalar encodings. Such a representation is of limited use, because closeness along a single dimension often cannot capture the relationships we care about.

The notebook uses `tf.keras.layers.Dot(axes=1)([vec1, vec2])`, available in TensorFlow 2.0, to combine the context and target words, and `tf.keras.layers.Embedding(vocab_size, output_size)` to embed the words as floating-point values between -1 and 1. The notebook can easily be extended to multiple dimensions, following the original paper, using the Keras Dot layer. The whole work is based on the original papers [1, 2]. Word2vec models are a good alternative to GloVe for embedding words as vectors and sentences as matrices, and they have been applied successfully in state-of-the-art models such as the T5 text-to-text transformer and the BiDAF machine comprehension network [3, 4].
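A minimal sketch of the pair-scoring network described above, assuming integer word ids as inputs and separate embedding tables for target and context words (the notebook's exact layer arrangement may differ); `output_size=1` reproduces the scalar special case:

```python
import tensorflow as tf

vocab_size = 1000   # assumed vocabulary size
output_size = 1     # scalar embedding: the 1D special case discussed above

target_in = tf.keras.Input(shape=(1,), name="target")
context_in = tf.keras.Input(shape=(1,), name="context")

# Embed both words, then flatten (batch, 1, output_size) -> (batch, output_size).
target_emb = tf.keras.layers.Embedding(vocab_size, output_size, name="target_embedding")
context_emb = tf.keras.layers.Embedding(vocab_size, output_size, name="context_embedding")
vec1 = tf.keras.layers.Flatten()(target_emb(target_in))
vec2 = tf.keras.layers.Flatten()(context_emb(context_in))

# Dot product of the two embeddings; sigmoid replaces softmax for a single output.
dot = tf.keras.layers.Dot(axes=1)([vec1, vec2])
score = tf.keras.layers.Activation("sigmoid")(dot)

model = tf.keras.Model([target_in, context_in], score)
model.compile(optimizer="adam", loss="binary_crossentropy")
```

Training such a model on (target, context) pairs with binary labels pushes true context pairs toward high dot products and random negative pairs toward low ones.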
## Sample Results
![download](https://user-images.githubusercontent.com/31654395/180498566-86245668-53c8-4383-bef9-c8175876170d.png)
A small plot of some words along two dimensions, showing closeness (positive skip-gram pairs) and distance (negative skip-gram pairs) as produced by our model.
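A hedged sketch of how such a plot could be produced, assuming the model sketch above was trained with `output_size=2` so each word has a 2D vector; the word list is purely illustrative:

```python
import matplotlib.pyplot as plt

# Learned embedding table, shape (vocab_size, output_size).
weights = target_emb.get_weights()[0]

# Hypothetical words from the story to visualise.
for word in ["question", "entropy", "universe", "data"]:
    x, y = weights[tokenizer.word_index[word]][:2]
    plt.scatter(x, y)
    plt.annotate(word, (x, y))
plt.xlabel("dimension 1")
plt.ylabel("dimension 2")
plt.show()
```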
## Bibliography
1. Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781 (2013).
2. Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in neural information processing systems 26 (2013).
3. Raffel, Colin, et al. "Exploring the limits of transfer learning with a unified text-to-text transformer." J. Mach. Learn. Res. 21.140 (2020): 1-67.
4. Seo, Minjoon, et al. "Bidirectional attention flow for machine comprehension." arXiv preprint arXiv:1611.01603 (2016).