https://github.com/kyegomez/griffin

Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"
ai feedforward ml mlp neural-network pytorch recurrent-networks rnns zeta

Host: GitHub
URL: https://github.com/kyegomez/griffin
Owner: kyegomez
License: mit
Created: 2024-03-04T16:51:37.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-04-04T12:57:51.000Z (7 months ago)
Last Synced: 2025-04-14T13:21:29.397Z (6 months ago)
Topics: ai, feedforward, ml, mlp, neural-network, pytorch, recurrent-networks, rnns, zeta
Language: Python
Homepage: https://discord.gg/GYbXvDGevY
Size: 36.2 MB
Stars: 52
Watchers: 4
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# Griffin

Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models". [PAPER LINK](https://huggingface.co/papers/2402.19427)

## install

`$ pip install griffin-torch`

## usage

```python
import torch

from griffin_torch.main import Griffin

# Input: a batch of 1 sequence with 10 token IDs drawn from [0, 100)
x = torch.randint(0, 100, (1, 10))

# Model
model = Griffin(
    dim=512,  # Dimension of the model
    num_tokens=100,  # Number of distinct tokens (vocabulary size)
    seq_len=10,  # Length of the input sequence
    depth=8,  # Number of stacked blocks
    mlp_mult=4,  # Multiplier for the hidden dimension in the MLPs
    dropout=0.1,  # Dropout rate
)

# Forward pass
y = model(x)

print(y)
```
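
For intuition about the "gated linear recurrence" in the paper title, here is a minimal scalar sketch in plain Python. It is not the code in this repository and does not use the `griffin_torch` API. It loosely follows the recurrence form described in the paper, `h_t = a_t * h_{t-1} + sqrt(1 - a_t^2) * (i_t * x_t)`. The function name, the scalar weights `w_r` and `w_i`, and the default values of `lam` and `c` are assumptions chosen for illustration only.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def gated_linear_recurrence(xs, w_r=1.0, w_i=1.0, lam=2.0, c=8.0):
    """Illustrative scalar recurrence (hypothetical helper, not the library API)."""
    a = sigmoid(lam)  # base decay, strictly between 0 and 1
    h = 0.0  # hidden state starts at zero
    hs = []
    for x in xs:
        r = sigmoid(w_r * x)  # recurrence gate in (0, 1)
        i = sigmoid(w_i * x)  # input gate in (0, 1)
        a_t = a ** (c * r)  # per-step decay; a larger r forgets faster
        # The update is linear in h: no nonlinearity is applied to the state.
        h = a_t * h + math.sqrt(1.0 - a_t * a_t) * (i * x)
        hs.append(h)
    return hs


states = gated_linear_recurrence([1.0, 0.5, -0.25, 0.0])
print(states)
```

Because the state update is linear in `h`, the gates depend only on the current input rather than on the previous state. In a real model each quantity would be a vector with learned projections.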

# License

MIT

# Citation

```
@misc{de2024griffin,
    title={Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models},
    author={Soham De and Samuel L. Smith and Anushan Fernando and Aleksandar Botev and George Cristian-Muraru and Albert Gu and Ruba Haroun and Leonard Berrada and Yutian Chen and Srivatsan Srinivasan and Guillaume Desjardins and Arnaud Doucet and David Budden and Yee Whye Teh and Razvan Pascanu and Nando De Freitas and Caglar Gulcehre},
    year={2024},
    eprint={2402.19427},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```
