{"id":26655646,"url":"https://github.com/aitechnologies-it/gpt-mini","last_synced_at":"2025-06-20T18:14:43.342Z","repository":{"id":53149686,"uuid":"520918190","full_name":"aitechnologies-it/gpt-mini","owner":"aitechnologies-it","description":"Yet another minimalistic Tensorflow (re-)re-implementation of Karpathy's Pytorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer).","archived":false,"fork":false,"pushed_at":"2022-11-18T16:14:05.000Z","size":2975,"stargazers_count":14,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-11T10:33:36.246Z","etag":null,"topics":["attention-is-all-you-need","attention-mechanism","generative-model","gpt","tensorflow","tf"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aitechnologies-it.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-08-03T14:38:46.000Z","updated_at":"2024-07-14T23:54:31.000Z","dependencies_parsed_at":"2022-09-24T23:53:14.214Z","dependency_job_id":null,"html_url":"https://github.com/aitechnologies-it/gpt-mini","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aitechnologies-it/gpt-mini","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aitechnologies-it%2Fgpt-mini","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aitechnologies-it%2Fgpt-mini/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aitechnologies-it%2Fgpt-mini/releases","manifests_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/aitechnologies-it%2Fgpt-mini/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aitechnologies-it","download_url":"https://codeload.github.com/aitechnologies-it/gpt-mini/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aitechnologies-it%2Fgpt-mini/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260994045,"owners_count":23094283,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attention-is-all-you-need","attention-mechanism","generative-model","gpt","tensorflow","tf"],"created_at":"2025-03-25T06:36:48.977Z","updated_at":"2025-06-20T18:14:38.310Z","avatar_url":"https://github.com/aitechnologies-it.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# gpt-mini\n\n\u003cimg src=\"dalle.png\" alt=\"A speedboat stopped by a futuristic cyborg, cyberpunk style.\" width=\"250\"\u003e\n\n##### *This image has been generated using OpenAI Dall-e 2.\n\n\u003cbr /\u003e This repository contains a minimalistic [TensorFlow](https://www.tensorflow.org/) (re-)re-implementation heavily inspired by [Karpathy's minGPT](https://github.com/karpathy/minGPT), a PyTorch re-implementation of the [OpenAI GPT](https://github.com/openai/gpt-2).\nThis code is intended for research and educational purposes and should be treated accordingly.\n\n* [gpt/](gpt) contains the actual model implementation ([gpt/modeling.py](gpt/modeling.py)) and the code for running training ([gpt/trainer.py](gpt/trainer.py)).\n\n## 
Setup\n\n```\n# Clone the repo.\ngit clone https://github.com/aitechnologies-it/gpt-mini\ncd gpt-mini\n\n# Make a python environment.\n# e.g. conda, pyenv\n\n# Prepare pip.\n# conda install pip\npip install --upgrade pip\n\n# Install requirements.\npip install -r requirements.txt\n```\n\n## Examples\n\nExample Python notebooks can be found in the main directory. We currently provide [play_text.ipynb](play_text.ipynb) to train a GPT (at both token and character level) to generate text from text provided as input. See also [train_tokenizer.ipynb](train_tokenizer.ipynb), which shows how to train a Hugging Face tokenizer on your own data.\nWe also provide [play_image.ipynb](play_image.ipynb) to train the model to generate CIFAR-10 images in an auto-regressive (pixel-level) fashion.\n\n## Usage\n\n```python\nimport tensorflow as tf\n\nfrom gpt.modeling import (GPT1Config, GPT)\nfrom gpt.trainer import (TrainerConfig, Trainer)\n\nclass MyDataset(tf.data.Dataset):\n    def _gen_examples_from(\n        data: tf.Tensor, ...\n    ):\n        def _gen():\n            for example in data:\n                ...\n                yield ...\n        return _gen\n\n    def __new__(\n        cls, inputs: tf.Tensor, block_size: int, batch_size: int, ...\n    ):\n        dataset = (\n            tf.data.Dataset.from_generator(\n                cls._gen_examples_from(data=inputs, ...),\n                output_signature=(\n                    tf.TensorSpec(shape=(block_size,), dtype=tf.int32),\n                    tf.TensorSpec(shape=(block_size,), dtype=tf.int32))\n                )\n                .batch(batch_size, drop_remainder=True)\n                .repeat()\n                .prefetch(tf.data.experimental.AUTOTUNE)\n                ...\n        )\n        return dataset\n\n\nconfig = GPT1Config(\n    vocab_size=128, block_size=1024,\n    n_layer=3, n_head=3, n_embd=48\n)\ntconf = TrainerConfig(\n    max_epochs=3, batch_size=64, learning_rate=0.003,\n    do_lr_decay=False, 
warmup_ratio=0.1, cosine_decay_alpha=0.0, weight_decay=0.0,\n    total_number_optimization_steps=total_number_optimization_steps, log_every_steps=10,\n    ckpt_path='./logs', trial_id='my_trial_id'\n)\n\nmodel = GPT(config)\n\n\ntrainer = Trainer(\n    model, dataset, total_number_optimization_steps, config=tconf\n)\n\ntrainer.train()\n\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faitechnologies-it%2Fgpt-mini","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faitechnologies-it%2Fgpt-mini","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faitechnologies-it%2Fgpt-mini/lists"}