# PicoGPT
Accompanying blog post: [GPT in 60 Lines of Numpy](https://jaykmody.com/blog/gpt-from-scratch/)

---

You've seen [openai/gpt-2](https://github.com/openai/gpt-2).

You've seen [karpathy/minGPT](https://github.com/karpathy/mingpt).

You've even seen [karpathy/nanoGPT](https://github.com/karpathy/nanogpt)!

But have you seen [picoGPT](https://github.com/jaymody/picoGPT)??!?

`picoGPT` is an unnecessarily tiny and minimal implementation of [GPT-2](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) in plain [NumPy](https://numpy.org). The entire forward pass is [40 lines of code](https://github.com/jaymody/picoGPT/blob/main/gpt2_pico.py#L3-L41).

picoGPT features:
* Fast? ❌ Nah, picoGPT is megaSLOW 🐌
* Training code? ❌ Error, 4️⃣0️⃣4️⃣ not found
* Batch inference? ❌ picoGPT is civilized, single-file line, one at a time only
* top-p sampling? ❌ top-k? ❌ temperature? ❌ categorical sampling?! ❌ greedy? ✅
* Readable? `gpt2.py` ✅ `gpt2_pico.py` ❌
* Smol??? ✅✅✅✅✅✅ YESS!!! TEENIE TINY in fact 🤏

A quick breakdown of each of the files:

* `encoder.py` contains the code for OpenAI's BPE tokenizer, taken straight from their [gpt-2 repo](https://github.com/openai/gpt-2/blob/master/src/encoder.py).
* `utils.py` contains the code to download and load the GPT-2 model weights, tokenizer, and hyperparameters.
* `gpt2.py` contains the actual GPT model and generation code, which we can run as a Python script.
* `gpt2_pico.py` is the same as `gpt2.py`, but in even fewer lines of code. Why? Because why not 😎👍.

#### Dependencies
```bash
pip install -r requirements.txt
```
Tested on `Python 3.9.10`.

#### Usage
```bash
python gpt2.py "Alan Turing theorized that computers would one day become"
```

Which generates:

```
 the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.
```

You can also control the number of tokens to generate, the model size (one of `["124M", "355M", "774M", "1558M"]`), and the directory to save the models:

```bash
python gpt2.py \
    "Alan Turing theorized that computers would one day become" \
    --n_tokens_to_generate 40 \
    --model_size "124M" \
    --models_dir "models"
```
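To give a flavor of what "GPT-2 in plain NumPy" means, here is a rough sketch of the kinds of primitives such a forward pass is built from. This is illustrative only, not the repo's actual `gpt2.py` code; the function names, shapes, and the `greedy_next_token` helper are assumptions for the sketch.

```python
import numpy as np

def gelu(x):
    # GPT-2's activation: the tanh approximation of GELU.
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def softmax(x):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

def layer_norm(x, g, b, eps=1e-5):
    # Normalize each row to zero mean / unit variance, then scale and shift.
    mean = np.mean(x, axis=-1, keepdims=True)
    var = np.var(x, axis=-1, keepdims=True)
    return g * (x - mean) / np.sqrt(var + eps) + b

def causal_attention(q, k, v):
    # Scaled dot-product attention with a causal mask, so each position
    # attends only to itself and earlier positions.
    n = q.shape[0]
    mask = (1 - np.tri(n)) * -1e10  # strict upper triangle -> large negative
    return softmax(q @ k.T / np.sqrt(q.shape[-1]) + mask) @ v

def greedy_next_token(logits):
    # Greedy sampling (the only strategy picoGPT supports): argmax of the
    # final position's logits.
    return int(np.argmax(logits[-1]))
```

For the real thing, stacked into transformer blocks with the pretrained GPT-2 weights, read the 40-line forward pass in `gpt2_pico.py` linked above.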