https://github.com/soumik12345/gpt
A minimal implementation of Generative Pre-Training or GPT
deep-learning gpt-2 keras nlp openai tensorflow
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/soumik12345/gpt
- Owner: soumik12345
- Created: 2020-10-29T22:56:30.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-11-07T18:16:52.000Z (over 5 years ago)
- Last Synced: 2025-03-28T22:35:23.703Z (over 1 year ago)
- Topics: deep-learning, gpt-2, keras, nlp, openai, tensorflow
- Language: Jupyter Notebook
- Homepage: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
- Size: 146 KB
- Stars: 3
- Watchers: 1
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# GPT (Ongoing)
TensorFlow implementation of Generative Pre-Training (GPT).

# Experiments
## Language Model

```python
from gpt.experiments.utils import init_wandb
from gpt.experiments.language_model import IMDBReviewLanguageExperiment

experiment = IMDBReviewLanguageExperiment()

# Set up Weights & Biases logging; replace the placeholder key with your own.
init_wandb(
    project_name='gpt', experiment_name='imdb_language_model',
    wandb_api_key='69696969696969696969696969696969696969696'
)

# Build the dataset from the IMDB review archive, then compile the model.
experiment.build_dataset('https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz')
experiment.compile()

# Tokenize the prompt that is passed to training as start_tokens.
start_text = 'the actor was'
start_tokens = experiment.tokenize(start_text=start_text)

experiment.train(
    epochs=30, start_tokens=start_tokens,
    max_length=100, max_tokens=40, top_k=10,
    infer_every=1, log_on_wandb=True
)
```
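
The `top_k=10` argument above presumably selects top-k sampling for the text generated during training. The repository's own sampling code is not shown here, so the following is only a minimal standard-library sketch of the general technique. The function name `sample_top_k` and the example logits are hypothetical and not part of the `gpt` package.

```python
import math
import random


def sample_top_k(logits, k, rng=random):
    """Sample one token id from the k highest-scoring logits (illustrative only)."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits, best first.
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Unnormalized softmax weights over the kept logits only,
    # shifted by the largest logit for numerical stability.
    max_logit = logits[top_ids[0]]
    weights = [math.exp(logits[i] - max_logit) for i in top_ids]
    # random.choices normalizes the weights internally.
    return rng.choices(top_ids, weights=weights, k=1)[0]


# Hypothetical scores for a 5-token vocabulary.
logits = [2.0, 0.5, -1.0, 3.0, 0.0]
token_id = sample_top_k(logits, k=2)
print(token_id)  # one of the two highest-scoring ids: 3 or 0
```

A smaller `k` restricts generation to the most likely tokens and gives more conservative text. Setting `k` to the vocabulary size samples from the full distribution.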