https://github.com/wozeparrot/tinyrwkv
tinygrad port of the RWKV large language model.
- Host: GitHub
- URL: https://github.com/wozeparrot/tinyrwkv
- Owner: wozeparrot
- License: apache-2.0
- Created: 2023-02-08T05:42:35.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-15T23:26:25.000Z (12 months ago)
- Last Synced: 2024-06-16T00:27:38.510Z (12 months ago)
- Topics: gpt, language-model, large-language-models, llm, rnn, rwkv, tinygrad
- Language: Python
- Homepage:
- Size: 1.25 MB
- Stars: 42
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Currently being rewritten in the `rewrite` branch.
# tinyrwkv: A tinier port of RWKV-LM
A port of the [RWKV-LM](https://github.com/BlinkDL/RWKV-LM) family of large language models to the [tinygrad](https://tinygrad.org/) framework.
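For background, RWKV replaces attention with a linear-time WKV recurrence that can run in RNN form. As a reference point (this is the published RWKV-4 formula, not this repo's custom kernel), a minimal NumPy sketch of the unstabilized per-channel recurrence:

```python
import numpy as np

def wkv_naive(w, u, k, v):
    """Unstabilized RWKV-4 WKV recurrence (reference sketch, not the repo's kernel).

    w: (C,) positive per-channel decay, applied as exp(-w) per step
    u: (C,) per-channel bonus applied only to the current token
    k, v: (T, C) keys and values
    returns: (T, C) per-token outputs
    """
    T, C = k.shape
    num = np.zeros(C)  # running weighted sum of values
    den = np.zeros(C)  # running sum of weights
    out = np.empty((T, C))
    for t in range(T):
        cur = np.exp(u + k[t])  # current token gets the bonus u
        out[t] = (num + cur * v[t]) / (den + cur)
        # decay the history, then fold in the current token without the bonus
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]
        den = np.exp(-w) * den + np.exp(k[t])
    return out
```

Each output is a convex combination of past and current values, which is why production kernels (including the one this repo implements as a custom function) add a running-maximum trick: `exp(k)` overflows quickly for large keys.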
## Roadmap
- [x] Implement the WKV kernel as a custom function
- [ ] Implement the backwards of the WKV kernel as a custom function
- [x] Add support for the world model and tokenizer
- [x] Add support for the MIDI models
- [x] Add initial support for RWKV-5 models

## Dependencies
Currently requires tinygrad from git; alternatively, use the provided nix flake.
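Assuming a pip-based environment (the nix flake handles this for you), one way to get tinygrad from git is a VCS install:

```shell
# Install tinygrad straight from its main branch; a PyPI release may lag behind
pip install git+https://github.com/tinygrad/tinygrad.git
```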
### Python
```
numpy
pydot (only for GRAPH=1)
tinygrad
tokenizers
torch (only for loading pytorch weights)
tqdm
wandb (optional during training)
```

### System
```
rust (only for compiling)
clang (only for compiling)
graphviz (only for GRAPH=1)
```

## Usage
Run the CLI with `python -m cli`.
It is also usable as a Python package for embedding in other projects; alternatively, the model can be compiled to portable C code and embedded that way.
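A typical workflow might look like the following sketch. The file names and positional arguments are illustrative (the help text below only lists the subcommands); check `python -m cli <subcommand> -h` for the real arguments:

```shell
# Convert pytorch weights into RNN form (file names are illustrative)
python -m cli pre RWKV-4-Pile-430M.pth rwkv.pre

# Freeform generation with the preprocessed model; --seed is a global option
python -m cli --seed 42 gen rwkv.pre

# Compile the RNN model to C source and an executable (needs CLANG=1)
CLANG=1 python -m cli cmp rwkv.pre
```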
```
usage: tinyrwkv-cli [-h] [--seed SEED] {pre,gen,cht,cmp,bch,ptr,gpt,tra,bpt,wkv,mus} ...

CLI for tinyrwkv
positional arguments:
{pre,gen,cht,cmp,bch,ptr,gpt,tra,bpt,wkv,mus}
pre preprocess either tinyrwkv trained weights or pytorch trained weights into RNN form
gen freeform generation using the RNN mode (requires a preprocessed model using `pre`)
cht chat with a model in RNN mode (requires a preprocessed model using `pre`)
cmp compile a RNN model into c source code and a compiled executable (need to run with CLANG=1)
bch benchmark the rnn mode
ptr preprocess pytorch weights into GPT form for training or inference
gpt freeform generation using the GPT mode (requires a preprocessed model using `ptr`)
tra pretrain or finetune a model (if finetuning the model needs to be preprocessed with `ptr`)
bpt benchmark the gpt mode
wkv benchmark/test each wkv module
mus music generation using the RNN mode (requires a preprocessed model using `pre`)

options:
-h, --help show this help message and exit
--seed SEED seed for random
```

## License
See the [LICENSE](./LICENSE) and [NOTICE](./NOTICE) files.