
# gpt-mini

*A speedboat stopped by a futuristic cyborg, cyberpunk style.*

##### *This image was generated using OpenAI DALL·E 2.*


This repository contains a minimalistic [TensorFlow](https://www.tensorflow.org/) (re-)re-implementation, heavily inspired by [Karpathy's minGPT](https://github.com/karpathy/minGPT), itself a PyTorch re-implementation of the [OpenAI GPT](https://github.com/openai/gpt-2).
This code is intended for research and educational purposes, and should be treated accordingly.

* [gpt/](gpt) contains the actual model implementation ([gpt/modeling.py](gpt/modeling.py)) and the training code ([gpt/trainer.py](gpt/trainer.py)).

## Setup

```bash
# Clone the repo.
git clone https://github.com/aitechnologies-it/gpt-mini
cd gpt-mini

# Make a Python environment,
# e.g. with conda or pyenv.

# Prepare pip.
# conda install pip
pip install --upgrade pip

# Install requirements.
pip install -r requirements.txt
```
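
For example, with conda (the environment name and Python version below are illustrative choices, not requirements of the repo):

```bash
# Create and activate a dedicated environment (name and version are arbitrary).
conda create -n gpt-mini python=3.9
conda activate gpt-mini
```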

## Examples

Example Python notebooks can be found in the main directory. We currently provide [play_text.ipynb](play_text.ipynb) to train a (token- or character-level) GPT to generate text from the text provided as input. Also check [train_tokenizer.ipynb](train_tokenizer.ipynb), which shows how to train a Hugging Face tokenizer on your own data; a generic sketch of that pattern follows below.
In addition, we provide [play_image.ipynb](play_image.ipynb) to train the model to generate CIFAR-10 images in an auto-regressive (pixel-level) fashion.
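
For reference, training a tokenizer with the Hugging Face `tokenizers` library generally follows this pattern (a minimal sketch; the corpus path, vocabulary size, and special tokens here are illustrative and may differ from what the notebook uses):

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Build a BPE tokenizer with whitespace pre-tokenization.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Train on your own text files; vocab_size should match the model's vocab_size.
trainer = BpeTrainer(vocab_size=128, special_tokens=["[UNK]"])
tokenizer.train(files=["my_corpus.txt"], trainer=trainer)

tokenizer.save("my_tokenizer.json")
```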

## Usage

```python
import tensorflow as tf

from gpt.modeling import GPT1Config, GPT
from gpt.trainer import TrainerConfig, Trainer

class MyDataset(tf.data.Dataset):
    # Returns a generator yielding (input, target) example pairs.
    @staticmethod
    def _gen_examples_from(
        data: tf.Tensor, ...
    ):
        def _gen():
            for example in data:
                ...
                yield ...
        return _gen

    def __new__(
        cls, inputs: tf.Tensor, block_size: int, batch_size: int, ...
    ):
        dataset = (
            tf.data.Dataset.from_generator(
                cls._gen_examples_from(data=inputs, ...),
                output_signature=(
                    tf.TensorSpec(shape=(block_size,), dtype=tf.int32),
                    tf.TensorSpec(shape=(block_size,), dtype=tf.int32),
                ),
            )
            .batch(batch_size, drop_remainder=True)
            .repeat()
            .prefetch(tf.data.experimental.AUTOTUNE)
            ...
        )
        return dataset

config = GPT1Config(
    vocab_size=128, block_size=1024,
    n_layer=3, n_head=3, n_embd=48
)

# Total number of optimization steps, e.g. steps_per_epoch * max_epochs.
total_number_optimization_steps = ...

tconf = TrainerConfig(
    max_epochs=3, batch_size=64, learning_rate=0.003,
    do_lr_decay=False, warmup_ratio=0.1, cosine_decay_alpha=0.0, weight_decay=0.0,
    total_number_optimization_steps=total_number_optimization_steps, log_every_steps=10,
    ckpt_path='./logs', trial_id='my_trial_id'
)

dataset = MyDataset(inputs=..., block_size=config.block_size, batch_size=tconf.batch_size)

model = GPT(config)

trainer = Trainer(
    model, dataset, total_number_optimization_steps, config=tconf
)

trainer.train()
```
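
For concreteness, a hypothetical character-level instantiation of the dataset sketch above might look as follows. The class name, the windowing scheme, and the step-count estimate are illustrative, not code taken from the repository:

```python
import tensorflow as tf

class CharDataset(tf.data.Dataset):
    """Each example is a block_size-long window of token ids (x) paired with
    the same window shifted one position to the right (y), as in language modeling."""

    @staticmethod
    def _gen_examples_from(data: tf.Tensor, block_size: int):
        def _gen():
            # Slide a window over the encoded text: x = chunk[:-1], y = chunk[1:].
            for i in range(len(data) - block_size):
                chunk = data[i : i + block_size + 1]
                yield chunk[:-1], chunk[1:]
        return _gen

    def __new__(cls, inputs: tf.Tensor, block_size: int, batch_size: int):
        return (
            tf.data.Dataset.from_generator(
                cls._gen_examples_from(data=inputs, block_size=block_size),
                output_signature=(
                    tf.TensorSpec(shape=(block_size,), dtype=tf.int32),
                    tf.TensorSpec(shape=(block_size,), dtype=tf.int32),
                ),
            )
            .batch(batch_size, drop_remainder=True)
            .repeat()
            .prefetch(tf.data.experimental.AUTOTUNE)
        )
```

With a windowed dataset like this, `total_number_optimization_steps` can be estimated as `(num_examples // batch_size) * max_epochs`, where `num_examples = len(data) - block_size`.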