https://github.com/memgonzales/semantle-word-embeddings
Recreation of Semantle (a word guessing game that gives the semantic similarity to the secret word) using three pretrained word embeddings: (1) word2vec, (2) GloVe, and (3) fastText
- Host: GitHub
- URL: https://github.com/memgonzales/semantle-word-embeddings
- Owner: memgonzales
- Created: 2022-12-26T09:37:25.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2023-01-23T16:12:13.000Z (over 2 years ago)
- Last Synced: 2025-01-20T11:11:25.445Z (5 months ago)
- Topics: dense-vector, fasttext, gensim, glove, glove-embeddings, natural-language-processing, natural-language-understanding, nlp, semantic-similarity, semantics, semantle, word-embeddings, word2vec
- Language: Jupyter Notebook
- Homepage:
- Size: 12.7 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Recreating Semantle with Word Embeddings
![badge][badge-jupyter]

This project attempts to recreate a version of the game [Semantle](https://semantle.com/), a variant of the five-letter word guessing game [Wordle](https://www.nytimes.com/games/wordle/index.html) that gives the semantic similarity of the player's guess to the secret word of the day. Our version of Semantle allows the player to choose from the following **pretrained word embeddings**:
- [word2vec](https://proceedings.neurips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf)
- [GloVe](https://aclanthology.org/D14-1162.pdf)
- [fastText](https://aclanthology.org/E17-2068.pdf)

All the scripts are placed inside a [Jupyter notebook](https://github.com/memgonzales/semantle-word-embeddings/blob/master/Semantle%20Recreation.ipynb), which also includes a detailed write-up covering the following:
- Design decisions in the implementation of the program
- Walkthrough of the implementation of the program
- Comparative analysis of selected word embeddings in the context of the program
- Insights on vector semantics, including:
  - Possible applications outside natural language processing (e.g., vectorizing protein sequences)
  - Ethical issues and latent biases in word embeddings

This [notebook](https://github.com/memgonzales/semantle-word-embeddings/blob/master/Semantle%20Recreation.ipynb) was created using [Google Colab](https://colab.research.google.com/) and invokes commands such as `gdown` and `wget`. The memory requirement of loading pretrained word embeddings may also be heavy for some local machines. Therefore, we recommend running the notebook on Colab.

This is a major course output in an introduction to natural language processing class under Mr. Edward P. Tighe of the Department of Software Technology, De La Salle University.
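
The semantic similarity that the game reports for each guess can be illustrated with cosine similarity between word vectors. The sketch below is a minimal, standard-library-only illustration and not the notebook's actual code: the toy three-dimensional vectors and the helper names `cosine_similarity`, `score_guess`, and `pick_secret` are hypothetical, and the use of cosine similarity as the scoring measure is an assumption made for this example.

```python
import math
import random

# Toy 3-dimensional vectors with made-up values (hypothetical, for illustration).
# Real pretrained embeddings typically have hundreds of dimensions.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "apple": [0.1, 0.2, 0.95],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def score_guess(guess, secret, vectors):
    """Score a guess against the secret word; None if the guess is out of vocabulary."""
    if guess not in vectors:
        return None
    return cosine_similarity(vectors[guess], vectors[secret])


def pick_secret(vectors, seed=None):
    """Pick a secret word pseudo-randomly; a fixed seed makes the choice repeatable."""
    rng = random.Random(seed)
    return rng.choice(sorted(vectors))
```

With these toy values, `score_guess("queen", "king", TOY_VECTORS)` is higher than `score_guess("apple", "king", TOY_VECTORS)`, which mirrors how a guess that is semantically closer to the secret word earns a higher score.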
## Built Using
This project is a Jupyter notebook, with the following Python libraries and modules used:

Library/Module | Description | License
-- | -- | --
[`gensim`](https://radimrehurek.com/gensim/) | Provides functions for training vector embeddings, topic modelling, document indexing, and similarity retrieval with large corpora | GNU Lesser General Public License v2.1
[`regex`](https://pypi.org/project/regex/) | Provides additional functionality over the standard [`re`](https://docs.python.org/3/library/re.html) module while maintaining backwards-compatibility | Apache License 2.0
`numpy` | Provides a multidimensional array object, various derived objects, and an assortment of routines for fast operations on arrays | BSD 3-Clause "New" or "Revised" License
[`io`](https://docs.python.org/3/library/io.html) | Provides Python's main facilities for dealing with various types of I/O | Python Software Foundation License
`random` | Provides functions for generating pseudo-random numbers with various common distributions | Python Software Foundation License

*The descriptions are taken from their respective websites.*
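
As a rough illustration of how the `io` module can be involved when working with pretrained embeddings, the sketch below parses vectors stored in the plain-text layout used by word2vec text exports and fastText `.vec` files: a header line with the vocabulary size and dimension, followed by one word and its values per line. This README does not show the notebook's loading code, so the function name `load_text_vectors` and the in-memory sample are assumptions made for this example. GloVe text files omit the header line, so they would need the header step skipped.

```python
import io


def load_text_vectors(stream):
    """Parse text-format embeddings from a text stream.

    Expects a header line "<count> <dim>" followed by lines of the form
    "<word> <v1> <v2> ... <v_dim>". Rows with the wrong number of fields
    are skipped.
    """
    header = stream.readline().split()
    count, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        tokens = line.rstrip().split(" ")
        if len(tokens) != dim + 1:
            continue  # skip malformed or empty rows
        vectors[tokens[0]] = [float(x) for x in tokens[1:]]
    return count, dim, vectors


# In-memory stand-in for a real embedding file (hypothetical values).
sample = io.StringIO("2 3\nking 0.9 0.8 0.1\nqueen 0.85 0.9 0.15\n")
count, dim, vectors = load_text_vectors(sample)
```

For a file on disk, the same function could be given the object returned by `open(path, encoding="utf-8")`. Reading line by line avoids holding the raw file text in memory alongside the parsed vectors, although the parsed dictionary itself can still be large for full pretrained models.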
## Authors
- Mark Edward M. Gonzales
[email protected]
[email protected]
- Hylene Jules G. Lee
[email protected]
[email protected]
- Phoebe Clare L. Ong
[email protected]
[email protected]

[badge-jupyter]: https://img.shields.io/badge/Jupyter-F37626.svg?&style=flat&logo=Jupyter&logoColor=white
[badge-pandas]: https://img.shields.io/badge/Pandas-2C2D72?style=flat&logo=pandas&logoColor=white
[badge-numpy]: https://img.shields.io/badge/Numpy-777BB4?style=flat&logo=numpy&logoColor=white
[badge-scipy]: https://img.shields.io/badge/SciPy-654FF0?style=flat&logo=SciPy&logoColor=white