# Continual Transformers TensorFlow

TensorFlow implementation of Continual Transformer building blocks, which augment regular transformer layers with the ability to compute the attention output _per token step_.

The layers are modelled on `tf.keras.layers.MultiHeadAttention` and should work as drop-in replacements in most cases.

## Setup
Continual Transformers and its modules can be installed in your project using:
```bash
pip install git+https://github.com/LukasHedegaard/continual-transformers-tf.git
```

## Layers
### [Continual Single-output Multi Head Attention](tests/test_co_si_mha.py)
```python
from continual_transformers_tf import CoSiMultiHeadAttention

layer = CoSiMultiHeadAttention(seq_len=10, num_heads=2, key_dim=4)
```

Fig. 1: Continual Single-Output Dot-Product Attention.
The key (K) and value (V) matrices are aggregated over time by caching the step vectors k_n and v_n in a FIFO queue. During each step, only the attention output associated with q is computed.
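
A minimal usage sketch, assuming the layer is called with query and value tensors in the style of `tf.keras.layers.MultiHeadAttention` (the exact step-wise interface may differ):

```python
import tensorflow as tf
from continual_transformers_tf import CoSiMultiHeadAttention

layer = CoSiMultiHeadAttention(seq_len=10, num_heads=2, key_dim=4)

# A batch with one sequence of 10 token steps and 4 features,
# matching seq_len=10 and key_dim=4 above.
x = tf.random.normal((1, 10, 4))

# Self-attention call (query == value), as in tf.keras.layers.MultiHeadAttention.
# For the single-output variant, the returned attention output is expected to
# correspond to the query of the most recent step (see Fig. 1).
y = layer(x, x)
```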



### [Circular Positional Encoding](tests/test_circular_embedding.py)
```python
from continual_transformers_tf import CircularPositionalEncoding

layer = CircularPositionalEncoding(max_len=10, embed_dim=4)
```

Fig. 2: Circular Positional Encoding.
At each step, a positional encoding is added in a round-robin fashion.
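
A sketch under the assumption that the layer adds its encodings to a full `(batch, seq, embed_dim)` input, as a standard Keras positional-encoding layer would; the per-step call interface is not shown here:

```python
import tensorflow as tf
from continual_transformers_tf import CircularPositionalEncoding

layer = CircularPositionalEncoding(max_len=10, embed_dim=4)

# Token embeddings for one sequence of 10 steps with 4 features each.
x = tf.random.normal((1, 10, 4))

# Assumed full-sequence call: step n receives the encoding at index
# n % max_len, so encodings are reused round-robin once a stream runs
# longer than max_len steps.
y = layer(x)
```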



### [Continual Single-output Transformer Encoder](tests/test_co_si_trans_enc.py)
```python
from continual_transformers_tf import CoSiTransformerEncoder

layer = CoSiTransformerEncoder(
    seq_len=10,
    embed_dim=4,
    num_heads=2,
    ff_dim=16,
    dropout_rate=0.1,
)
```
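
A usage sketch with an assumed full-sequence call signature (a `(batch, seq_len, embed_dim)` input); for online inference, the encoder would instead be fed one token step at a time, reusing the cached keys and values:

```python
import tensorflow as tf
from continual_transformers_tf import CoSiTransformerEncoder

layer = CoSiTransformerEncoder(
    seq_len=10,
    embed_dim=4,
    num_heads=2,
    ff_dim=16,
    dropout_rate=0.1,
)

# One sequence of 10 token embeddings with 4 features each,
# matching seq_len=10 and embed_dim=4 above.
x = tf.random.normal((1, 10, 4))

# Assumed call on the full sequence; the single-output encoder is expected
# to return the output associated with the latest token step.
y = layer(x)
```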

## Citation
```
@article{hedegaard2022cotrans,
  title={Continual Transformers: Redundancy-Free Attention for Online Inference},
  author={Lukas Hedegaard and Alexandros Iosifidis},
  journal={preprint, arXiv:2201.06268},
  year={2022}
}
```