https://github.com/kyegomez/simplifiedtransformers
SimplifiedTransformer simplifies the transformer block without affecting training: skip connections, projection parameters, sequential sub-blocks, and normalization layers are removed. Experimental results confirm similar training speed and performance.
- Host: GitHub
- URL: https://github.com/kyegomez/simplifiedtransformers
- Owner: kyegomez
- License: mit
- Created: 2023-12-02T18:54:59.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-04-11T12:59:05.000Z (6 months ago)
- Last Synced: 2025-04-15T11:46:46.284Z (6 months ago)
- Topics: artificial-intelligence, gpt4, machine-learning, machine-learning-algorithms, neural-architecture-search, neural-networks, swarms
- Language: Python
- Homepage: https://discord.gg/GYbXvDGevY
- Size: 2.16 MB
- Stars: 14
- Watchers: 2
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
[Discord](https://discord.gg/qUtxnK2NMf)
# SimplifiedTransformers
The author presents an implementation of Simplifying Transformer Blocks. The standard transformer block is complex, and that complexity makes the architecture fragile: seemingly minor changes can slow or destabilize training. This work investigates how the standard block can be simplified. Combining signal propagation theory with empirical observations, the author motivates modifications that remove several components without sacrificing training speed or performance. The simplified transformers match the per-update training speed and performance of standard transformers while achieving 15% higher training throughput and using 15% fewer parameters.
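How these removals fit together is easiest to see in code. The sketch below is illustrative only, not the repository's implementation (the class name `SimplifiedBlockSketch` and the `mlp_mult` argument are invented for this example): a block with no residual connections and no normalization, in which attention keeps only query/key projections (values are the block input itself) and runs in parallel with the MLP.

```python
import torch
from torch import nn
import torch.nn.functional as F


class SimplifiedBlockSketch(nn.Module):
    """Illustrative sketch of a simplified block: no skip connections,
    no LayerNorm, no value/output projections, parallel sub-blocks."""

    def __init__(self, dim: int, heads: int, mlp_mult: int = 4):
        super().__init__()
        assert dim % heads == 0, "dim must be divisible by heads"
        self.heads, self.head_dim = heads, dim // heads
        # Only query/key projections remain; the value and output
        # projections of standard attention are removed (identity).
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_mult),
            nn.GELU(),
            nn.Linear(dim * mlp_mult, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape

        def split(t: torch.Tensor) -> torch.Tensor:
            return t.reshape(b, n, self.heads, self.head_dim).transpose(1, 2)

        q, k = split(self.to_q(x)), split(self.to_k(x))
        v = split(x)  # identity values: the block input itself, per head
        attn = F.scaled_dot_product_attention(q, k, v)
        attn = attn.transpose(1, 2).reshape(b, n, d)
        # Attention and MLP read the same input and their outputs are
        # summed directly: no residual branch, no normalization layer.
        return attn + self.mlp(x)


block = SimplifiedBlockSketch(dim=512, heads=8)
print(block(torch.randn(2, 16, 512)).shape)  # torch.Size([2, 16, 512])
```

Skipless blocks like this are hard to train naively; the paper leans on signal propagation analysis (e.g., shaped attention and careful initialization) to recover trainability, details this sketch omits for brevity.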
# Install

```
pip3 install --upgrade simplified-transformer-torch
```
--------
## Usage
```python
import torch
from simplified_transformers.main import SimplifiedTransformers

# Embedding dimension, number of blocks, attention heads, and vocab size.
model = SimplifiedTransformers(
    dim=4096,
    depth=6,
    heads=8,
    num_tokens=20000,
)

# A batch of token IDs: batch size 1, sequence length 4096.
x = torch.randint(0, 20000, (1, 4096))

out = model(x)
print(out.shape)
```
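Assuming the model ends in a projection back onto the vocabulary, the printed shape should be `torch.Size([1, 4096, 20000])`: one row of logits per input position.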
# License
MIT