https://github.com/Hannibal046/RWKV-howto
Possibly useful materials for learning the RWKV language model.
- Host: GitHub
- URL: https://github.com/Hannibal046/RWKV-howto
- Owner: Hannibal046
- Created: 2023-05-21T08:54:14.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-06-08T15:54:11.000Z (almost 2 years ago)
- Last Synced: 2024-11-12T04:42:20.534Z (6 months ago)
- Size: 5.86 KB
- Stars: 25
- Watchers: 4
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-llm - RWKV Tutorial - Materials and tutorials for learning RWKV. (Other related papers)
- Awesome-LLM - RWKV-howto - possibly useful materials and tutorial for learning RWKV. (Other Papers)
README
# RWKV-howto
Possibly useful materials and tutorials for learning [RWKV](https://www.rwkv.com).
> RWKV: Parallelizable RNN with Transformer-level LLM Performance.
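As a quick orientation, here is a minimal numpy sketch of the "WKV" recurrence at the heart of RWKV's time mixing, written in its sequential RNN form (the same computation can also be unrolled over the sequence in parallel during training). It follows the formulation popularized by the RWKV_in_150_lines implementation linked under Code below; the names, shapes, and the omitted token-shift/receptance/output projections are simplifications for illustration, not the exact upstream code.

```python
import numpy as np

def wkv_step(k_t, v_t, num, den, decay, bonus):
    """One step of the (simplified, per-channel) WKV recurrence.

    k_t, v_t : key/value vectors for the current token, shape [d]
    num, den : numerator/denominator state carried between tokens, shape [d]
    decay    : per-channel decay rate (w > 0), shape [d]
    bonus    : extra weight (u) given to the current token, shape [d]
    """
    # Output: exponentially decayed weighted average of past values,
    # with the current token up-weighted by the bonus term.
    wkv = (num + np.exp(bonus + k_t) * v_t) / (den + np.exp(bonus + k_t))
    # State update: decay the history, then add the current token.
    num = np.exp(-decay) * num + np.exp(k_t) * v_t
    den = np.exp(-decay) * den + np.exp(k_t)
    return wkv, num, den

# Toy usage: scan a random sequence one token at a time (RNN mode).
d, T = 8, 16
rng = np.random.default_rng(0)
K, V = rng.normal(size=(T, d)), rng.normal(size=(T, d))
decay, bonus = np.full(d, 0.5), np.zeros(d)
num, den = np.zeros(d), np.zeros(d)
for t in range(T):
    out, num, den = wkv_step(K[t], V[t], num, den, decay, bonus)
```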
### Relevant Papers
- :star2:(2023-05) RWKV: Reinventing RNNs for the Transformer Era [arxiv](https://arxiv.org/abs/2305.13048)
- (2023-03) Resurrecting Recurrent Neural Networks for Long Sequences [arxiv](https://arxiv.org/abs/2303.06349)
- (2023-02) SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks [arxiv](https://arxiv.org/abs/2302.13939)
- (2022-08) Simplified State Space Layers for Sequence Modeling [ICLR2023](https://openreview.net/forum?id=Ai8Hw3AXqks)
- :star2:(2021-05) An Attention Free Transformer [arxiv](https://arxiv.org/abs/2105.14103)
- (2021-10) Efficiently Modeling Long Sequences with Structured State Spaces [ICLR2022](https://arxiv.org/abs/2111.00396)
- (2020-08) Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention [ICML2020](https://arxiv.org/abs/2006.16236)
- (2018) Parallelizing Linear Recurrent Neural Nets Over Sequence Length [ICLR2018](https://openreview.net/forum?id=HyUNwulC-)
- (2017-09) Simple Recurrent Units for Highly Parallelizable Recurrence [EMNLP2018](https://arxiv.org/abs/1709.02755)
- (2017-10) MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks [Neurips2017](https://arxiv.org/abs/1711.06788)
- (2017-06) Attention Is All You Need [Neurips2017](https://arxiv.org/abs/1706.03762)
- (2016-11) Quasi-Recurrent Neural Networks [ICLR2017](https://arxiv.org/abs/1611.01576)

### Resources
- Introducing RWKV - An RNN with the advantages of a transformer [Hugging Face](https://huggingface.co/blog/rwkv)
- Now that we have the Transformer framework, can RNNs be discarded entirely? [Zhihu](https://www.zhihu.com/question/302392659/answer/2954997969)
- What is the simplest effective form of an RNN? [Zhihu](https://zhuanlan.zhihu.com/p/616357772)
- :star2:The RNN/CNN duality of RWKV [Zhihu](https://zhuanlan.zhihu.com/p/614311961)
- Does an RNN's hidden layer need nonlinearity? [Zhihu](https://zhuanlan.zhihu.com/p/615672175)
- Google's new work tries to "resurrect" RNNs: can RNNs shine again? [Su Jianlin](https://kexue.fm/archives/9554)
- :star2:How the RWKV language model works [Johan Sokrates Wind](https://www.mn.uio.no/math/english/people/aca/johanswi/index.html)
- :star2:The RWKV language model: An RNN with the advantages of a transformer [Johan Sokrates Wind](https://johanwind.github.io/2023/03/23/rwkv_overview.html)
- The Unreasonable Effectiveness of Recurrent Neural Networks [Andrej Karpathy blog](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

### Code
- [RWKV-LM](https://github.com/BlinkDL/RWKV-LM)
- [ChatRWKV](https://github.com/BlinkDL/ChatRWKV)
- [RWKV_in_150_lines](https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_in_150_lines.py)
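The inference code above ultimately runs RWKV in RNN mode: the model is evaluated one token at a time while a small recurrent state is carried forward, so decoding needs no growing key/value cache. A hedged sketch of that generation loop is below; `model.forward(token, state)` is an assumed interface for illustration, not the exact API of ChatRWKV or RWKV_in_150_lines.

```python
import numpy as np

def generate(model, prompt_tokens, n_new_tokens):
    """Greedy RNN-mode decoding with an explicitly carried state.

    `model.forward(token, state)` is a hypothetical interface: it should
    return next-token logits and the updated recurrent state.
    """
    state, logits = None, None
    for tok in prompt_tokens:          # ingest the prompt token by token
        logits, state = model.forward(tok, state)
    generated = []
    for _ in range(n_new_tokens):      # constant work and memory per decoded token
        tok = int(np.argmax(logits))   # greedy choice, for simplicity
        generated.append(tok)
        logits, state = model.forward(tok, state)
    return generated
```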