{"id":22420554,"url":"https://github.com/eniompw/nanogptshakespeare","last_synced_at":"2025-08-01T04:32:13.199Z","repository":{"id":65505070,"uuid":"593579175","full_name":"eniompw/nanoGPTshakespeare","owner":"eniompw","description":"finetuning shakespeare on karpathy/nanoGPT","archived":false,"fork":false,"pushed_at":"2023-02-02T23:12:02.000Z","size":62,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2023-03-09T01:26:33.647Z","etag":null,"topics":["colab","colab-notebook","gpt","gpt-2","shakespeare","transformer"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eniompw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2023-01-26T11:10:16.000Z","updated_at":"2023-02-22T16:11:03.000Z","dependencies_parsed_at":"2023-02-14T17:15:43.613Z","dependency_job_id":null,"html_url":"https://github.com/eniompw/nanoGPTshakespeare","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eniompw%2FnanoGPTshakespeare","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eniompw%2FnanoGPTshakespeare/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eniompw%2FnanoGPTshakespeare/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eniompw%2FnanoGPTshakespeare/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eniompw","download_url":"https://codeload.github.com/eniompw/nanoGPTsh
akespeare/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228330534,"owners_count":17903089,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["colab","colab-notebook","gpt","gpt-2","shakespeare","transformer"],"created_at":"2024-12-05T16:20:14.709Z","updated_at":"2024-12-05T16:20:15.976Z","avatar_url":"https://github.com/eniompw.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# nanoGPT shakespeare\n### using Google Colab to finetune nanoGPT on shakespeare\n\n* [Based on karpathy/nanoGPT](https://github.com/karpathy/nanoGPT)\n* [Example Jupyter Notebook on Colab](https://colab.research.google.com/drive/1G97dn-Ivle2PgjH3MXjnkOHYOnxlrf79)\n* [Example Jupyter Notebook on GitHub](https://github.com/eniompw/nanoGPTshakespeare/blob/main/nanoGPTshakespeare.ipynb)\n\n\n### Train: finetune GPT on the shakespeare dataset  \n`python train.py --dtype=float16 --dataset=shakespeare --compile=False --n_layer=4 --n_head=4 --n_embd=64 --block_size=64 --batch_size=8 --init_from=gpt2 --eval_interval=100 --eval_iters=100 --max_iters=300 --bias=True`\n\n`train.py` arguments explained:\n\n* the Colab GPU doesn't support the default bfloat16:\n  * `--dtype=float16`\n* Colab currently uses PyTorch 1.13.1+cu116, but compile requires PyTorch 2.0:\n  * `--compile=False`\n* models larger than `gpt2-medium` run out of RAM (12.7GB) on Colab:\n  * `--init_from=gpt2-medium`\n* [\"smaller Transformer\"](https://github.com/karpathy/nanoGPT#i-only-have-a-macbook) speeds up training significantly:\n  * 
`--n_layer=4 --n_head=4 --n_embd=64 --block_size=64 --batch_size=8`\n* evaluate and save the model every 100 iters:\n  * `--eval_interval=100`\n* estimate the train/val loss over 100 batches at each evaluation:\n  * `--eval_iters=100`\n* stop training after 300 iters:\n  * `--max_iters=300`\n\n### Sample: view output from the saved model   \n`!cd ./nanoGPT \u0026\u0026 python sample.py --dtype=float16 --num_samples=5 --max_new_tokens=10 --start=\"to be\"`\n\n`sample.py` arguments explained:\n\n* number of separate examples to output:\n  * `--num_samples=5`\n* number of tokens to generate per example (words ~ tokens x 0.75):\n  * `--max_new_tokens=10`\n* start each output example with:\n  * `--start=\"to be\"`\n\n**Full Colab Code:**\n```\n  # download repo\n  !git clone https://github.com/karpathy/nanoGPT.git\n  \n  # install dependencies\n  !pip install tiktoken transformers\n  \n  # download shakespeare dataset into ./data/shakespeare\n  !cd ./nanoGPT/data/shakespeare/ \u0026\u0026 python prepare.py\n  \n  # finetune gpt2-medium with \"smaller Transformer\" on GPU, model saved in ./out (300 iters seems to give the lowest val loss)\n  !cd ./nanoGPT/ \u0026\u0026 python train.py --dataset=shakespeare --n_layer=4 --n_head=4 --n_embd=64 --compile=False --block_size=64 --batch_size=8 --init_from=gpt2-medium --dtype=float16 --eval_interval=100 --eval_iters=100 --max_iters=300 --bias=True\n  \n  # print 5 samples of 10 tokens each, starting with \"to be\"\n  !cd ./nanoGPT \u0026\u0026 python sample.py --dtype=float16 --num_samples=5 --max_new_tokens=10 --start=\"to be\"\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feniompw%2Fnanogptshakespeare","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feniompw%2Fnanogptshakespeare","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feniompw%2Fnanogptshakespeare/lists"}