https://github.com/lukashedegaard/continual-transformers-tf
TensorFlow implementation of Continual Transformer building blocks
- Host: GitHub
- URL: https://github.com/lukashedegaard/continual-transformers-tf
- Owner: LukasHedegaard
- License: apache-2.0
- Created: 2022-02-24T08:00:20.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-02-25T14:45:56.000Z (about 3 years ago)
- Last Synced: 2025-01-06T16:14:51.041Z (4 months ago)
- Language: Python
- Size: 521 KB
- Stars: 2
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Continual Transformers TensorFlow
TensorFlow implementation of Continual Transformer building blocks, which augment regular transformer layers with the ability to compute the attention output _per token step_.
The layers are modelled on the `tf.keras.layers.MultiHeadAttention` and should work as drop-in replacements in most cases.
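
For reference, the stock Keras layer that these blocks mirror is constructed and called as shown below. The snippet exercises only `tf.keras.layers.MultiHeadAttention`; that the continual layers accept exactly the same call arguments is an assumption based on the drop-in claim rather than something documented here.

```python
import tensorflow as tf

# Standard Keras multi-head self-attention over a full sequence.
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)
x = tf.random.normal((1, 10, 4))   # (batch, seq_len, embed_dim)
y = mha(query=x, value=x)          # self-attention output, shape (1, 10, 4)
```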
## Setup
Continual Transformers and its modules can be installed in your project using:
```bash
pip install git+https://github.com/LukasHedegaard/continual-transformers-tf.git
```

## Layers
### [Continual Single-output Multi Head Attention](tests/test_co_si_mha.py)
```python
from continual_transformers_tf import CoSiMultiHeadAttention

layer = CoSiMultiHeadAttention(seq_len=10, num_heads=2, key_dim=4)
```
Fig. 1: Continual Single-Output Dot-Product Attention.
The key (K) and value (V) matrices are aggregated over time by caching the step vectors k_n and v_n in a FIFO queue. During each step, only the attention output associated with q is computed.
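
The snippet below is a minimal sketch of this idea (single head, unbatched vectors), not the library's actual implementation: the current step's key and value vectors are pushed into fixed-length FIFO buffers, and a single attention output is computed for the current query.

```python
import tensorflow as tf

# Sketch only: single-output dot-product attention with FIFO-cached K and V.
def single_output_attention_step(q_n, k_n, v_n, K, V):
    # Evict the oldest row and append the newest step vectors.
    K = tf.concat([K[1:], k_n[tf.newaxis, :]], axis=0)    # (seq_len, d_k)
    V = tf.concat([V[1:], v_n[tf.newaxis, :]], axis=0)    # (seq_len, d_v)
    # Scaled dot-product scores of the current query against all cached keys.
    d_k = tf.cast(tf.shape(K)[-1], q_n.dtype)
    scores = tf.linalg.matvec(K, q_n) / tf.sqrt(d_k)      # (seq_len,)
    weights = tf.nn.softmax(scores)                       # (seq_len,)
    # Weighted sum of cached values: one output vector per step.
    out = tf.linalg.matvec(V, weights, transpose_a=True)  # (d_v,)
    return out, K, V
```

The buffers can be initialized to zeros of shape `(seq_len, key_dim)` and threaded through successive calls, one token step at a time.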
### [Circular Positional Embedding](tests/test_circular_embedding.py)
```python
from continual_transformers_tf import CircularPositionalEncoding

layer = CircularPositionalEncoding(max_len=10, embed_dim=4)
```
Fig. 2: Circular Positional Encoding.
At each step, a positional encoding is added in a round-robin fashion.
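
A minimal sketch of the round-robin idea is shown below; the class name and call signature are illustrative assumptions, not the library's `CircularPositionalEncoding` API.

```python
import tensorflow as tf

# Sketch only: a learned positional table whose rows are assigned to steps
# round-robin, so step t reuses slot t % max_len.
class RoundRobinPositionalEncoding(tf.keras.layers.Layer):
    def __init__(self, max_len, embed_dim):
        super().__init__()
        self.max_len = max_len
        self.pos_emb = tf.keras.layers.Embedding(max_len, embed_dim)

    def call(self, x_t, step):
        # x_t: (batch, embed_dim) token embedding for the current step;
        # step: scalar step counter starting at 0.
        return x_t + self.pos_emb(step % self.max_len)
```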
### [Continual Single-output Transformer Encoder](tests/test_co_si_trans_enc.py)
```python
from continual_transformers_tf import CoSiTransformerEncoder

layer = CoSiTransformerEncoder(
seq_len=10,
embed_dim=4,
num_heads=2,
ff_dim=16,
dropout_rate=0.1,
)
```

## Citation
```
@article{hedegaard2022cotrans,
title={Continual Transformers: Redundancy-Free Attention for Online Inference},
author={Lukas Hedegaard and Alexandros Iosifidis},
journal={preprint, arXiv:2201.06268},
year={2022}
}
```