
# `wordplay` 🎮 💬
Sam Foreman
2023-12-20

- [Background](#background)
- [Completed](#completed)
- [In Progress](#in-progress)
- [Install](#install)

*Playing with words*.

A set of simple, **scalable** and *highly configurable* tools for
working[^1] with LLMs.

## Background

What started as some simple
[modifications](https://github.com/saforem2/nanoGPT) to Andrej
Karpathy's `nanoGPT` has now grown into the `wordplay` project.


While `nanoGPT` is a great project and an **excellent** resource, it is,
*by design*, very minimal[^2] and limited in its flexibility.

Working through the code I found myself making minor changes here and
there to test new ideas and run variations on different experiments.
These changes eventually built up to the point where *my*
`{goals, scope, code}` for the project had diverged significantly from
the original vision.

As a result, I figured it made more sense to move things to a new
project, [`wordplay`](https://github.com/saforem2/wordplay).

I’ve prioritized adding functionality that I’ve found useful or
interesting, but I’m absolutely open to input or suggestions for
improvement.

Different aspects of this project have been motivated by some of my
recent work on LLMs.

- Projects:
  - [`ezpz`](https://github.com/saforem2/ezpz): Painless distributed
    training with your favorite `{framework, backend}` combo.
  - [`Megatron-DeepSpeed`](https://github.com/argonne-lcf/Megatron-DeepSpeed):
    Ongoing research training transformer language models at scale,
    including: BERT & GPT-2
- Collaboration(s):
  - **DeepSpeed4Science** (2023-09)
    - [Loooooooong Sequence Lengths](https://samforeman.me/qmd/dsblog)
    - [Project Website](https://www.deepspeed4science.ai/)
    - [Preprint](https://arxiv.org/abs/2310.04610), Song et al. (2023)
    - [Blog Post](https://www.microsoft.com/en-us/research/blog/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies/)
    - [Tutorial](https://www.deepspeed.ai/deepspeed4science/)
  - GenSLMs:
    - [GitHub](https://github.com/ramanathanlab/genslm)
    - [Preprint](https://www.biorxiv.org/content/10.1101/2022.10.10.511571v2)
    - 🏆 [ACM Gordon Bell Special Prize for COVID-19
      Research](https://www.acm.org/media-center/2022/november/gordon-bell-special-prize-covid-research-2022)
- Talks / Workshops:
  - **LLM-lunch-talk** (2023-10-12): LLMs at [ALCF](https://alcf.anl.gov)
    - [Slides](https://saforem2.github.io/llm-lunch-talk/#/section)
    - [GitHub](https://github.com/saforem2/llm-lunch-talk)
  - **Creating Small(-ish) LLMs** (2023-11-30)
    - [Workshop](https://github.com/brettin/llm_tutorial/blob/main/tutorials/03-smallish-LLMs/README.md)
    - [Slides](https://saforem2.github.io/LLM-tutorial/#/creating-small-ish-llmsslides-gh)
    - [GitHub](https://github.com/saforem2/LLM-tutorial)

## Completed

- [x] [DeepSpeed](https://deepspeed.ai/) support (✅: 2024-01-03)
- [x] Work with *any* 🤗 HuggingFace
  [dataset](https://huggingface.co/docs/datasets/index) (see the sketch
  after this list)
- [x] Effortless distributed training using
  [`ezpz`](https://github.com/saforem2/ezpz)
- [x] Improved (type-safe) and extensible configuration system (powered
  by [`hydra`](https://hydra.cc)), see [\#config](#config)
- [x] Automatic, detailed experiment + metric tracking with [Weights &
  Biases](https://wandb.ai)
  - [Example Workspace](https://wandb.ai/l2hmc-qcd/WordPlay?workspace=user-saforem2)
  - [Example Run](https://wandb.ai/l2hmc-qcd/WordPlay/runs/in83cm3o/workspace?workspace=user-saforem2)
- [x] [Rich](https://github.com/Textualize/rich) informative logging
  with [`enrich`](https://github.com/saforem2/enrich)
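
As a rough illustration of the HuggingFace piece, loading and tokenizing
an arbitrary 🤗 dataset looks like the sketch below. This uses the
`datasets` and `transformers` libraries directly; the exact wiring
inside `wordplay` may differ.

``` python
# Minimal sketch: load any 🤗 dataset and tokenize it for LM training.
# Illustrative only; wordplay's internal data pipeline may differ.
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("openwebtext", split="train")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

def tokenize(batch):
    # Batched tokenization; the raw text column is dropped afterwards.
    return tokenizer(batch["text"])

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
print(tokenized)
```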

## In Progress

- [ ] [Fully Sharded Data Parallel
  (FSDP)](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/)
  support (see the sketch after this list)
  - [Introducing PyTorch Fully Sharded Data Parallel (FSDP) API \|
    PyTorch](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/)
- [ ] 3D Parallelism support via:
  - [Megatron-DeepSpeed](https://github.com/argonne-lcf/Megatron-DeepSpeed)
  - native PyTorch:
    - [Pipeline Parallelism — PyTorch 2.1
      documentation](https://pytorch.org/docs/stable/pipeline.html)
    - [pytorch/PiPPy: Pipeline Parallelism for
      PyTorch](https://github.com/pytorch/PiPPy)
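
For reference, wrapping a model with PyTorch-native FSDP looks roughly
like the sketch below. This illustrates the target technique, not
`wordplay`'s eventual API; it assumes the script is launched with
`torchrun` so a distributed process group can be initialized.

``` python
# Minimal FSDP sketch (PyTorch >= 1.12), illustrating the planned
# technique rather than wordplay's actual API. Assumes launch via
# `torchrun`, which sets RANK / WORLD_SIZE / LOCAL_RANK.
import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Toy transformer; FSDP shards its parameters across all ranks.
model = torch.nn.Transformer(d_model=512, num_encoder_layers=6).cuda()
model = FSDP(model)

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
```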

## Install

### Grab-n-Go

The easiest way to get the most recent version is to:

``` bash
python3 -m pip install "git+https://github.com/saforem2/wordplay.git"
```

### Development

If you’d like to work with the project and run / change things yourself,
I’d recommend installing from a local (editable) clone of this
repository:

``` bash
git clone "https://github.com/saforem2/wordplay"
cd wordplay
python3 -m venv venv --system-site-packages
source venv/bin/activate
python3 -m pip install -e .
```
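
In either case, a quick import is enough to confirm the install worked
(assuming the package installs under the module name `wordplay`; adjust
if the module name differs):

``` python
# Post-install sanity check; assumes the module imports as `wordplay`.
import wordplay

print(wordplay.__file__)
```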

------------------------------------------------------------------------

Last Updated: 12/20/2023 @ 10:05:31

![](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fsaforem2.github.io%2Fwordplay&count_bg=%23222222&title_bg=%23303030&icon=&icon_color=%23E7E7E7)

Song, Shuaiwen Leon, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang
Chen, Chengming Zhang, Masahiro Tanaka, et al. 2023. “DeepSpeed4Science
Initiative: Enabling Large-Scale Scientific Discovery Through
Sophisticated AI System Technologies.” <https://arxiv.org/abs/2310.04610>.

[^1]:
    ``` json
    [
      "training",
      "fine-tuning",
      "benchmarking",
      "parallelizing",
      "distributing",
      "measuring",
      "..."
    ]
    ```

    large models at scale.

[^2]: `nano`, even 😂