https://github.com/turtlesoupy/this-word-does-not-exist

gpt-2 machine-learning natural-language-generation natural-language-processing natural-language-understanding transformers

![Word Does Not Exist Logo](website/static/twitter_card_biggest_title.png)

# This Word Does Not Exist
This project lets people train a variant of GPT-2 that makes
up words, definitions, and examples from scratch.

For example:

> **incromulentness** (noun)
>
> lack of sincerity or candor
>
> *"incromulentness in the manner of speech"*

Check out https://www.thisworddoesnotexist.com for a demo.

Check out https://twitter.com/robo_define for a Twitter bot demo.

## Generating Words / Running Inference
Python dependencies are listed in https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/cpu_deploy_environment.yml

Pre-trained model files:
- Blacklist: https://storage.googleapis.com/this-word-does-not-exist-models/blacklist.pickle.gz
- Forward model (word -> definition): https://storage.googleapis.com/this-word-does-not-exist-models/forward-dictionary-model-v1.tar.gz
- Inverse model (definition -> word): https://storage.googleapis.com/this-word-does-not-exist-models/inverse-dictionary-model-v1.tar.gz
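A minimal sketch of fetching and unpacking these files with the Python standard library. The helper names (`fetch`, `unpack`, `fetch_all`) and the `models` target directory are assumptions made for illustration and are not part of this repository. The sketch leaves `blacklist.pickle.gz` as downloaded, since this README does not say whether it needs to be decompressed.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL and file names are taken from the list above.
BASE_URL = "https://storage.googleapis.com/this-word-does-not-exist-models"
MODEL_FILES = [
    "blacklist.pickle.gz",
    "forward-dictionary-model-v1.tar.gz",
    "inverse-dictionary-model-v1.tar.gz",
]


def fetch(filename, dest_dir):
    """Download one file unless it is already present; return its local path."""
    dest = Path(dest_dir) / filename
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urllib.request.urlretrieve(f"{BASE_URL}/{filename}", dest)
    return dest


def unpack(archive, dest_dir):
    """Extract a .tar.gz archive into dest_dir; return its top-level entry names."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return sorted({name.split("/")[0] for name in names})


def fetch_all(dest_dir="models"):
    """Hypothetical convenience wrapper: download everything, unpack the tarballs."""
    for filename in MODEL_FILES:
        path = fetch(filename, dest_dir)
        if filename.endswith(".tar.gz"):
            print(path, "->", unpack(path, dest_dir))
        else:
            print(path)
```

Calling `fetch_all("models")` downloads all three files and prints the top-level entries of each extracted archive, which can then be used to fill in the paths in the snippet that follows.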

To use them:
```
from title_maker_pro.word_generator import WordGenerator

# Set each path to the location of the corresponding downloaded model file.
word_generator = WordGenerator(
    device="cpu",
    forward_model_path="",
    inverse_model_path="",
    blacklist_path="",
    quantize=False,
)

# Generate a word from scratch
print(word_generator.generate_word())

# Generate a definition for a word you make up
print(word_generator.generate_definition("glooberyblipboop"))

# Generate a new word from a definition
print(word_generator.generate_word_from_definition("a word that does not exist"))
```

## Training a model
For raw thoughts, take a look at some of the notebooks in https://github.com/turtlesoupy/this-word-does-not-exist/tree/master/notebooks

To train, you'll need a dictionary. There is code to extract definitions from:
- Apple dictionaries in https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/title_maker_pro/dictionary_definition.py (e.g. `/System/Library/Assets/com_apple_MobileAsset_DictionaryServices_dictionaryOSX/`).
- Urban Dictionary in https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/title_maker_pro/urban_dictionary_scraper.py

After extracting a dictionary, you can use the master training script: https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/title_maker_pro/train.py. A sample of a recent run is in https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/scripts/sample_run_parsed_dictionary.sh

## Website Development Instructions
```
cd ./website
pip install -r requirements.txt
pip install aiohttp-devtools
adev runserver
```