https://github.com/kyegomez/ct

Implementation of the attention and transformer from "Building Blocks for a Complex-Valued Transformer Architecture"

Host: GitHub
URL: https://github.com/kyegomez/ct
Owner: kyegomez
License: mit
Created: 2023-10-07T15:05:33.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-03-11T02:31:36.000Z (over 1 year ago)
Last Synced: 2025-04-30T23:17:49.734Z (3 months ago)
Topics: artificial-intelligence, attention, attention-is-all-you-need, attention-mechanism, complex-numbers, complex-signals, dotproduct, fourier-transform
Language: Python
Homepage: https://discord.gg/qUtxnK2NMf
Size: 2.16 MB
Stars: 9
Watchers: 2
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# Complex Transformer (WIP)

An open-source implementation of the attention and transformer from "Building Blocks for a Complex-Valued Transformer Architecture", which proposes an attention mechanism for complex-valued signals or images, such as MRI and remote sensing data.

The authors present:

- complex-valued scaled dot-product attention
- complex-valued layer normalization
- results showing improved robustness to overfitting while maintaining performance when compared to a real-valued transformer
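
As a rough illustration of the first building block, the sketch below shows one way complex-valued scaled dot-product attention could be written in plain PyTorch. It assumes the similarity between a query and a key is the real part of their Hermitian inner product, so that an ordinary softmax applies. This is only one possible variant and is not taken from this repository's code; the function name `complex_sdpa` is hypothetical.

```python
import math

import torch


def complex_sdpa(q, k, v):
    """Hypothetical sketch: softmax(Re(Q K^H) / sqrt(d)) V.

    q, k, v are complex tensors of shape (batch, seq_len, dim).
    """
    d = q.shape[-1]
    # Hermitian inner product: conjugate the keys before the dot product.
    scores = torch.matmul(q, k.conj().transpose(-2, -1))
    # Assumption: the real part serves as a real-valued similarity score.
    weights = torch.softmax(scores.real / math.sqrt(d), dim=-1)
    # The weights are real; cast them to the complex dtype of the values.
    return torch.matmul(weights.to(v.dtype), v)


torch.manual_seed(0)
q = torch.randn(2, 4, 8, dtype=torch.cfloat)
k = torch.randn(2, 4, 8, dtype=torch.cfloat)
v = torch.randn(2, 4, 8, dtype=torch.cfloat)
out = complex_sdpa(q, k, v)
print(out.shape, out.dtype)
```

Because the attention weights are real and each row sums to one, every output token is a convex combination of the complex value vectors.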

## Install

`pip install complex-attn`

## Usage

```python
import torch
from ct.attention import ComplexAttention

# Usage example
dim = 512
heads = 8
seq_len = 512
batch_size = 32

# Build complex-valued queries, keys, and values (real part + 1j * imaginary part)
q = torch.randn(batch_size, seq_len, dim) + 1j * torch.randn(
    batch_size, seq_len, dim
)
k = torch.randn(batch_size, seq_len, dim) + 1j * torch.randn(
    batch_size, seq_len, dim
)
v = torch.randn(batch_size, seq_len, dim) + 1j * torch.randn(
    batch_size, seq_len, dim
)

attention_layer = ComplexAttention(
    dim,
    heads,
    qk_norm=True,
    dropout=0.1,
)

attn_output = attention_layer(q, k, v)
print("Attention Output Shape:", attn_output.shape)
```

# Architecture

- For simplicity, I use a regular norm instead of the complex-valued layer norm described in the paper.
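
For context, here is a minimal sketch of what a complex-valued layer norm might look like. It assumes the norm whitens the real and imaginary parts jointly over the feature dimension using their 2x2 covariance matrix, the approach used by complex batch normalization. The function name `complex_layer_norm` is hypothetical, learnable scale and shift parameters are omitted, and the details may differ from both the paper and this repository.

```python
import torch


def complex_layer_norm(x, eps=1e-5):
    """Hypothetical sketch: whiten real/imaginary parts over the last dim."""
    # Center the real and imaginary parts separately.
    re = x.real - x.real.mean(dim=-1, keepdim=True)
    im = x.imag - x.imag.mean(dim=-1, keepdim=True)
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = (re * re).mean(dim=-1, keepdim=True) + eps  # Var(Re)
    c = (im * im).mean(dim=-1, keepdim=True) + eps  # Var(Im)
    b = (re * im).mean(dim=-1, keepdim=True)        # Cov(Re, Im)
    # Closed-form inverse square root of a symmetric positive-definite
    # 2x2 matrix: (1 / (s * t)) * [[c + s, -b], [-b, a + s]].
    s = torch.sqrt(a * c - b * b)
    t = torch.sqrt(a + c + 2 * s)
    inv = 1.0 / (s * t)
    out_re = inv * ((c + s) * re - b * im)
    out_im = inv * ((a + s) * im - b * re)
    return torch.complex(out_re, out_im)


torch.manual_seed(0)
r = torch.randn(3, 64)
# Correlated real and imaginary parts, to show that whitening removes the correlation.
x = torch.complex(r, 0.5 * r + 0.5 * torch.randn(3, 64))
y = complex_layer_norm(x)
print(y.shape, y.dtype)
```

A regular norm skips the covariance term, which is simpler but leaves any correlation between the real and imaginary parts in place.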

# License

MIT

# Citations

```
@article{2306.09827,
  Author = {Florian Eilers and Xiaoyi Jiang},
  Title = {Building Blocks for a Complex-Valued Transformer Architecture},
  Year = {2023},
  Eprint = {arXiv:2306.09827},
  Doi = {10.1109/ICASSP49357.2023.10095349},
}
```
