{"id":13478132,"url":"https://github.com/saforem2/wordplay","last_synced_at":"2025-04-30T08:23:43.258Z","repository":{"id":212788814,"uuid":"732078395","full_name":"saforem2/wordplay","owner":"saforem2","description":"Playing with words","archived":false,"fork":false,"pushed_at":"2024-11-05T14:21:43.000Z","size":14710,"stargazers_count":4,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-23T17:59:06.674Z","etag":null,"topics":["gpt","llm","pytorch"],"latest_commit_sha":null,"homepage":"https://saforem2.github.io/wordplay/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/saforem2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-15T15:33:38.000Z","updated_at":"2025-03-31T23:55:59.000Z","dependencies_parsed_at":"2024-05-21T16:59:43.658Z","dependency_job_id":null,"html_url":"https://github.com/saforem2/wordplay","commit_stats":null,"previous_names":["saforem2/wordplay"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saforem2%2Fwordplay","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saforem2%2Fwordplay/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saforem2%2Fwordplay/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saforem2%2Fwordplay/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/saforem2","download_url":"https://codeload.github.com/saforem2/wor
dplay/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251667223,"owners_count":21624450,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gpt","llm","pytorch"],"created_at":"2024-07-31T16:01:52.918Z","updated_at":"2025-04-30T08:23:43.236Z","avatar_url":"https://github.com/saforem2.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# \u003cspan class=\"title\"\u003e`wordplay` 🎮 💬\u003c/span\u003e\nSam Foreman\n2023-12-20\n\n- [Background](#background)\n- [Completed](#completed)\n- [In Progress](#in-progress)\n- [Install](#install)\n\n\u003c!-- ::: {.quarto-title} --\u003e\n\u003c!----\u003e\n\u003c!-- ::: {.quarto-title-block} --\u003e\n\u003c!----\u003e\n\u003c!----\u003e\n\u003c!-- ::: --\u003e\n\u003c!----\u003e\n\u003c!-- ::: --\u003e\n\n*Playing with words*.\n\nA set of simple, **scalable** and *highly configurable* tools for\nworking[^1] with LLMs.\n\n## Background\n\nWhat started as some simple\n[modifications](https://github.com/saforem2/nanoGPT) to Andrej\nKarpathy's `nanoGPT` has now grown into the `wordplay` project.\n\n\u003c!-- ::: {#fig-compare gap=\"5%\" layout=\"[[40,40]]\" layout-valign=\"bottom\" style=\"text-align: center!important;\" fig-align=\"center\"} --\u003e\n\u003c!-- ::: {layout-ncol=2 gap=\"5%\" layout-valign=\"bottom\"} --\u003e\n\u003c!-- :::: {.columns layout-ncol=2 layout-valign=\"bottom\" style=\"margin-bottom: 4em;\" style=\"text-align:center\"} --\u003e\n\u003c!-- ::: {layout=\"[15,-10,15]\" layout-valign=\"bottom\"} 
--\u003e\n\u003c!-- :::: {#fig-compare layout-ncol=2 layout-valign=\"bottom\" style=\"display: flex; align-items: flex-end; text-align:center;\"} --\u003e\n\u003c!-- ::: {#fig-compare layout=\"[[40,-5,40]]\" layout-valign=\"center\" style=\"text-align: center;\"} --\u003e\n\u003c!----\u003e\n\u003c!-- ![`nanoGPT`](https://github.com/saforem2/wordplay/blob/main/docs/assets/nanoGPT.png?raw=true){#fig-nanoGPT} --\u003e\n\u003c!----\u003e\n\u003c!-- ![`wordplay`](https://github.com/saforem2/wordplay/blob/main/docs/assets/wordplay.png?raw=true){#fig-wordplay} --\u003e\n\u003c!----\u003e\n\u003c!-- Generated using --\u003e\n\u003c!-- [prodia/sdxl-stable-diffusion-xl](https://huggingface.co/spaces/prodia/sdxl-stable-diffusion-xl) --\u003e\n\u003c!-- on 🤗 HuggingFace. --\u003e\n\u003c!-- ::: --\u003e\n\n\u003c!--\u003cdiv id=\"fig-compare\" layout-valign=\"bottom\"\nstyle=\"display: flex; align-items: flex-end;\"\u003e\n\n\u003ctable style=\"width:100%;\"\u003e\n\u003ccolgroup\u003e\n\u003ccol style=\"width: 44%\" /\u003e\n\u003ccol style=\"width: 11%\" /\u003e\n\u003ccol style=\"width: 44%\" /\u003e\n\u003c/colgroup\u003e\n\u003ctbody\u003e\n\u003ctr class=\"odd\"\u003e\n\u003ctd style=\"text-align: center;\"\u003e\u003cdiv width=\"44.4%\"\ndata-layout-align=\"center\"\u003e\n\u003cp\u003e\u003cimg\nsrc=\"https://github.com/saforem2/wordplay/blob/main/assets/car.png?raw=true\"\nid=\"fig-nanogpt\" data-ref-parent=\"fig-compare\" data-fig.extended=\"false\"\nwidth=\"256\" alt=\"(a) nanoGPT\" /\u003e\u003c/p\u003e\n\u003c/div\u003e\u003c/td\u003e\n\u003ctd style=\"text-align: center;\"\u003e\u003cdiv class=\"quarto-figure-spacer\"\nwidth=\"11.1%\" data-layout-align=\"center\"\u003e\n\u003cp\u003e \u003c/p\u003e\n\u003c/div\u003e\u003c/td\u003e\n\u003ctd style=\"text-align: center;\"\u003e\u003cdiv width=\"44.4%\"\ndata-layout-align=\"center\"\u003e\n\u003cp\u003e\u003cimg\nsrc=\"https://github.com/saforem2/wordplay/blob/main/assets/robot.png?raw=true\"\nid=\"fig-wordplay\" 
data-ref-parent=\"fig-compare\"\ndata-fig.extended=\"false\" width=\"150\" alt=\"(b) wordplay\" /\u003e\u003c/p\u003e\n\u003c/div\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\n\nFigure 1: Imagine `nanoGPT`, with *all* the add-ons.\n\n\u003c/div\u003e\n--\u003e\n\n\u003cdetails closed\u003e\n\u003csummary\u003e\nIf you’re curious…\n\u003c/summary\u003e\n\nWhile `nanoGPT` is a great project and an **excellent** resource, it is,\n*by design*, very minimal[^2] and limited in its flexibility.\n\nWorking through the code, I found myself making minor changes here and\nthere to test new ideas and run variations on different experiments.\nThese changes eventually built to the point where *my*\n`{goals, scope, code}` for the project had diverged significantly from\nthe original vision.\n\nAs a result, I figured it made more sense to move things to a new\nproject, [`wordplay`](https://github.com/saforem2/wordplay).\n\nI’ve prioritized adding functionality that I have found to be useful or\ninteresting, but am absolutely open to input or suggestions for\nimprovement.\n\nDifferent aspects of this project have been motivated by some of my\nrecent work on LLMs.\n\n- Projects:\n  - [`ezpz`](https://github.com/saforem2/ezpz): Painless distributed\n    training with your favorite `{framework, backend}` combo.\n  - [`Megatron-DeepSpeed`](https://github.com/argonne-lcf/Megatron-DeepSpeed):\n    Ongoing research training transformer language models at scale,\n    including: BERT \u0026 GPT-2\n- Collaboration(s):\n  - **DeepSpeed4Science** (2023-09)\n    - [Loooooooong Sequence Lengths](https://samforeman.me/qmd/dsblog)\n    - [Project Website](https://www.deepspeed4science.ai/)\n    - [Preprint](https://arxiv.org/abs/2310.04610) Song et al. 
(2023)\n    - [Blog\n      Post](https://www.microsoft.com/en-us/research/blog/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies/)\n    - [Tutorial](https://www.deepspeed.ai/deepspeed4science/)\n  - GenSLMs:\n    - [GitHub](https://github.com/ramanathanlab/genslm)\n    - [Preprint](https://www.biorxiv.org/content/10.1101/2022.10.10.511571v2)\n    - 🏆 [ACM Gordon Bell Special Prize for COVID-19\n      Research](https://www.acm.org/media-center/2022/november/gordon-bell-special-prize-covid-research-2022)\n- Talks / Workshops:\n  - **LLM-lunch-talk** (2023-10-12): LLMs at\n    [ALCF](https://alcf.anl.gov).\n    - [Slides](https://saforem2.github.io/llm-lunch-talk/#/section)\n    - [GitHub](https://github.com/saforem2/llm-lunch-talk)\n  - **Creating Small(-ish) LLMs** (2023-11-30)\n    - [Workshop](https://github.com/brettin/llm_tutorial/blob/main/tutorials/03-smallish-LLMs/README.md)\n    - [Slides](https://saforem2.github.io/LLM-tutorial/#/creating-small-ish-llmsslides-gh)\n    - [GitHub](https://github.com/saforem2/LLM-tutorial)\n\n\u003c/details\u003e\n\n## Completed\n\n\n- [x] [DeepSpeed](https://deepspeed.ai/) support (✅: 2024-01-03)\n- [x] Work with *any* 🤗 HuggingFace\n  [dataset](https://huggingface.co/docs/datasets/index)\n- [x] Effortless distributed training using\n  [`ezpz`](https://github.com/saforem2/ezpz)\n- [x] Improved (type-safe) and extensible configuration system (powered\n  by [`hydra`](https://hydra.cc)), see [\\#config](#config)\n- [x] Automatic, detailed experiment + metric tracking with [Weights \u0026\n  Biases](https://wandb.ai)\n  - [Example\n    Workspace](https://wandb.ai/l2hmc-qcd/WordPlay?workspace=user-saforem2)\n  - [Example\n    Run](https://wandb.ai/l2hmc-qcd/WordPlay/runs/in83cm3o/workspace?workspace=user-saforem2)\n- [x] [Rich](https://github.com/Textualize/rich) informative logging\n  with [`enrich`](https://github.com/saforem2/enrich)\n\n## In 
Progress\n\n- [ ] [Fully Sharded Data Parallel\n  (FSDP)](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/)\n  support\n  - [Introducing PyTorch Fully Sharded Data Parallel (FSDP) API \\|\n    PyTorch](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/)\n- [ ] 3D Parallelism support via:\n  - [Megatron-DeepSpeed](https://github.com/argonne-lcf/Megatron-DeepSpeed)\n  - native PyTorch:\n    - [Pipeline Parallelism — PyTorch 2.1\n      documentation](https://pytorch.org/docs/stable/pipeline.html)\n    - [pytorch/PiPPy: Pipeline Parallelism for\n      PyTorch](https://github.com/pytorch/PiPPy)\n\n## Install\n\n\u003cdetails open\u003e\n\u003csummary\u003e\nGrab-n-Go\n\u003c/summary\u003e\n\nThe easiest way to get the most recent version is to run:\n\n``` bash\npython3 -m pip install \"git+https://github.com/saforem2/wordplay.git\"\n```\n\n\u003c/details\u003e\n\u003cdetails closed\u003e\n\u003csummary\u003e\nDevelopment\n\u003c/summary\u003e\n\nIf you’d like to work with the project and run / change things yourself,\nI’d recommend installing from a local (editable) clone of this\nrepository:\n\n``` bash\ngit clone \"https://github.com/saforem2/wordplay\"\ncd wordplay\nmkdir -p venv\npython3 -m venv venv --system-site-packages\nsource venv/bin/activate\npython3 -m pip install -e .\n```\n\n\u003c/details\u003e\n\u003c!-- # `wordplay` --\u003e\n\u003c!----\u003e\n\u003c!-- A minimal LLM implementation for research and education. 
--\u003e\n\u003c!-- \u0026title=visitors) --\u003e\n\u003c!-- \u0026edge_flat=false) --\u003e\n\u003c!-- \u003cp align=\"center\"\u003e --\u003e\n\u003c!-- \u003ca href=\"https://hits.seeyoufarm.com\"\u003e --\u003e\n\u003c!--     \u003cimg align=\"center\" src=\"https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fsaforem2.github.io%2Fwordplay\u0026count_bg=%2300CCFF\u0026title_bg=%23303030\u0026icon=\u0026icon_color=%23E7E7E7\u0026title=hits\u0026edge_flat=false\"/\u003e --\u003e\n\u003c!--   \u003c/a\u003e --\u003e\n\u003c!-- \u003c/p\u003e --\u003e\n\u003c!-- ## []{.pink-text} Last Updated --\u003e\n\n------------------------------------------------------------------------\n\n\u003cpre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"\u003e\u003cspan style=\"color: #7f7f7f; text-decoration-color: #7f7f7f; font-style: italic\"\u003eLast Updated\u003c/span\u003e: \u003cspan style=\"color: #f06292; text-decoration-color: #f06292; font-weight: bold\"\u003e12\u003c/span\u003e\u003cspan style=\"color: #f06292; text-decoration-color: #f06292\"\u003e/\u003c/span\u003e\u003cspan style=\"color: #f06292; text-decoration-color: #f06292; font-weight: bold\"\u003e20\u003c/span\u003e\u003cspan style=\"color: #f06292; text-decoration-color: #f06292\"\u003e/\u003c/span\u003e\u003cspan style=\"color: #f06292; text-decoration-color: #f06292; font-weight: bold\"\u003e2023\u003c/span\u003e \u003cspan style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"\u003e@\u003c/span\u003e \u003cspan style=\"color: #1a8fff; text-decoration-color: #1a8fff; font-weight: bold\"\u003e10:05:31\u003c/span\u003e\n\u003c/pre\u003e\n\n![](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fsaforem2.github.io%2Fwordplay\u0026count_bg=%23222222\u0026title_bg=%23303030\u0026icon=\u0026icon_color=%23E7E7E7)\n\n\u003cdiv id=\"refs\" class=\"references csl-bib-body 
hanging-indent\"\u003e\n\n\u003cdiv id=\"ref-song2023deepspeed4science\" class=\"csl-entry\"\u003e\n\nSong, Shuaiwen Leon, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang\nChen, Chengming Zhang, Masahiro Tanaka, et al. 2023. “DeepSpeed4Science\nInitiative: Enabling Large-Scale Scientific Discovery Through\nSophisticated AI System Technologies.”\n\u003chttps://arxiv.org/abs/2310.04610\u003e.\n\n\u003c/div\u003e\n\n\u003c/div\u003e\n\n[^1]:\n\n    ``` json\n    {\n      \"training\",\n      \"fine-tuning\",\n      \"benchmarking\",\n      \"parallelizing\",\n      \"distributing\",\n      \"measuring\",\n      \"...\"\n    }\n    ```\n\n    large models at scale.\n\n[^2]: `nano`, even 😂\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsaforem2%2Fwordplay","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsaforem2%2Fwordplay","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsaforem2%2Fwordplay/lists"}