RoPE Playground – Rotary Positional Embeddings in PyTorch
- Host: GitHub
- URL: https://github.com/sakhileln/rope-pytorch
- Owner: sakhileln
- License: gpl-2.0
- Created: 2025-08-13T18:22:35.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-08-13T19:03:03.000Z (about 2 months ago)
- Last Synced: 2025-08-13T20:50:20.309Z (about 2 months ago)
- Topics: gpt-neox, llama, llm, ntk, qwen, rope, transformer
- Homepage:
- Size: 1.08 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# PyTorch RoPE
Implementation of Rotary Positional Embeddings (RoPE), the positional encoding scheme used in modern transformers such as LLaMA, Qwen, and GPT-NeoX.

### Run locally
```bash
# Clone the repo
git clone https://github.com/sakhileln/rope-pytorch.git
cd rope-pytorch
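
# Install dependencies (requirements.txt is at the repo root, per the structure below)
pip install -r requirements.txt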
```

### Repo Structure
```bash
rope-pytorch/
├── README.md # Intro + explanation
├── rope.py # Core RoPE functions
├── transformer.py # Tiny Transformer with RoPE
├── visualize_rope.py # 2D/3D visualizations of rotation
├── train_demo.py # Train on toy data (e.g., copy task)
└── requirements.txt
```
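For reference, here is a minimal sketch of the rotation that `rope.py` implements, using the common "rotate half" formulation of RoPE. The function names (`rope_cache`, `rotate_half`, `apply_rope`) and tensor shapes are illustrative assumptions, not necessarily what this repo uses:

```python
import torch


def rope_cache(seq_len: int, head_dim: int, base: float = 10000.0):
    """Precompute cos/sin tables of shape (seq_len, head_dim)."""
    # One frequency per pair of dimensions: base^(-2i/d)
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)  # (seq_len, head_dim/2)
    angles = torch.cat([angles, angles], dim=-1)                   # (seq_len, head_dim)
    return angles.cos(), angles.sin()


def rotate_half(x: torch.Tensor) -> torch.Tensor:
    """Swap the two halves of the last dimension and negate the second half."""
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat([-x2, x1], dim=-1)


def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    """Rotate x of shape (batch, heads, seq_len, head_dim) according to position."""
    return x * cos + rotate_half(x) * sin


# Queries and keys are rotated before attention scores are computed,
# which is roughly how a tiny Transformer (e.g. transformer.py) would use RoPE.
q = torch.randn(1, 4, 16, 64)   # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 4, 16, 64)
cos, sin = rope_cache(seq_len=16, head_dim=64)
q_rot, k_rot = apply_rope(q, cos, sin), apply_rope(k, cos, sin)
```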
### Possible Extras
- Support for NTK scaling (used to extend the context length of LLaMA-family models); a sketch of the idea follows this list.
- **Benchmark**: Compare perplexity with/without RoPE on a small language modeling dataset (like TinyShakespeare).
- A notebook version for Google Colab.
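The NTK extra, sketched under the assumption of the commonly used "NTK-aware" formulation: keep the position indices unchanged and raise the rotation base, so that the lowest frequencies are stretched over a longer context. The names and the default `scaling_factor` below are illustrative:

```python
import torch


def ntk_scaled_rope_cache(seq_len: int, head_dim: int,
                          base: float = 10000.0, scaling_factor: float = 2.0):
    """cos/sin tables with an NTK-aware base adjustment for longer contexts."""
    # Increase the base so the lowest frequencies span scaling_factor times
    # the original context; dim / (dim - 2) is the usual NTK-aware exponent.
    base = base * scaling_factor ** (head_dim / (head_dim - 2))
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)
    angles = torch.cat([angles, angles], dim=-1)
    return angles.cos(), angles.sin()
```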
### Sources
- [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864)
- [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
- [Qwen Technical Report](https://arxiv.org/abs/2309.16609)
- [GPT-NeoX-20B: An Open-Source Autoregressive Language Model](https://arxiv.org/abs/2204.06745)
- [The Illustrated Transformer](http://jalammar.github.io/illustrated-transformer/)
- [The Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html)
- [Rotary Positional Embeddings](https://docs.pytorch.org/torchtune/stable/generated/torchtune.modules.RotaryPositionalEmbeddings.html)
- [An In-depth exploration of Rotary Position Embedding (RoPE)](https://aiexpjourney.substack.com/p/an-in-depth-exploration-of-rotary-position-embedding-rope-ac351a45c794)
- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/pdf/1810.04805)

## Contact
- Sakhile L. Ndlazi
- [LinkedIn Profile](https://www.linkedin.com/in/sakhile-)