https://github.com/lucidrains/conformer

Implementation of the convolutional module from the Conformer paper, for use in Transformers

Host: GitHub
URL: https://github.com/lucidrains/conformer
Owner: lucidrains
License: mit
Created: 2020-07-26T18:56:01.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2023-05-17T21:08:21.000Z (about 3 years ago)
Last Synced: 2025-05-16T19:02:04.962Z (about 1 year ago)
Topics: artificial-intelligence, audio, deep-learning, transformer
Language: Python
Homepage:
Size: 46.9 KB
Stars: 397
Watchers: 9
Forks: 52
Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE

## Conformer

Implementation of the convolutional module from the Conformer paper, for improving the local inductive bias in Transformers.

## Install

```bash
$ pip install conformer
```

## Usage

The Conformer convolutional module, the main novelty of the paper:

```python
import torch
from conformer import ConformerConvModule

layer = ConformerConvModule(
    dim = 512,
    causal = False,             # auto-regressive or not - 1d conv will be made causal with padding if so
    expansion_factor = 2,       # what multiple of the dimension to expand for the depthwise convolution
    kernel_size = 31,           # kernel size, 17 - 31 was said to be optimal
    dropout = 0.                # dropout at the very end
)

x = torch.randn(1, 1024, 512)
x = layer(x) + x
```
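The `causal` comment above says the 1D convolution is made causal with padding. As a rough illustration of what that usually means for a stride-1 convolution, the sketch below computes how much zero-padding goes on each side so the sequence length is preserved. The helper names `conv_padding` and `conv_output_length` are hypothetical and not part of this package, and the exact padding used inside `ConformerConvModule` may differ.

```python
from typing import Tuple


def conv_padding(kernel_size: int, causal: bool) -> Tuple[int, int]:
    """Return (left, right) zero-padding for a stride-1 conv that keeps length fixed.

    Hypothetical helper for illustration only, not the package's API.
    """
    if causal:
        # All padding goes on the left, so position t only sees positions <= t.
        return kernel_size - 1, 0
    left = kernel_size // 2
    # For even kernel sizes, pad one less on the right so the total is kernel_size - 1.
    right = left - (kernel_size + 1) % 2
    return left, right


def conv_output_length(seq_len: int, kernel_size: int, left: int, right: int) -> int:
    """Output length of a stride-1, dilation-1 convolution after padding."""
    return seq_len + left + right - kernel_size + 1


for causal in (False, True):
    left, right = conv_padding(31, causal)
    print(causal, (left, right), conv_output_length(1024, 31, left, right))
```

With `kernel_size = 31`, both modes add 30 padded positions in total, so a length-1024 input stays length 1024. The only difference is whether the padding is split across both sides or placed entirely on the left.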

A single Conformer block:

```python
import torch
from conformer import ConformerBlock

block = ConformerBlock(
    dim = 512,
    dim_head = 64,
    heads = 8,
    ff_mult = 4,
    conv_expansion_factor = 2,
    conv_kernel_size = 31,
    attn_dropout = 0.,
    ff_dropout = 0.,
    conv_dropout = 0.
)

x = torch.randn(1, 1024, 512)

block(x) # (1, 1024, 512)
```
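For orientation, the cited paper describes a block as two half-step feed-forward residuals sandwiching a self-attention residual and a convolution residual, followed by a final normalization. The sketch below shows only that residual ordering in plain Python, with placeholder sublayers. The name `conformer_block_sketch` and its arguments are hypothetical, and it is an assumption that `ConformerBlock` follows the paper's ordering exactly.

```python
from typing import Callable, List

Vector = List[float]
SubLayer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector, scale: float = 1.0) -> Vector:
    """Residual connection: a + scale * b, elementwise."""
    return [x + scale * y for x, y in zip(a, b)]


def conformer_block_sketch(
    x: Vector,
    ff1: SubLayer,
    attn: SubLayer,
    conv: SubLayer,
    ff2: SubLayer,
    norm: SubLayer,
) -> Vector:
    """Residual ordering of a Conformer block as described in the paper (illustrative only)."""
    x = add(x, ff1(x), scale=0.5)   # first half-step feed-forward residual
    x = add(x, attn(x))             # self-attention residual
    x = add(x, conv(x))             # convolution module residual
    x = add(x, ff2(x), scale=0.5)   # second half-step feed-forward residual
    return norm(x)                  # final normalization


def identity(v: Vector) -> Vector:
    """Placeholder sublayer that returns a copy of its input."""
    return list(v)


def zero(v: Vector) -> Vector:
    """Placeholder sublayer that contributes nothing to the residual stream."""
    return [0.0] * len(v)


# With sublayers that output zero, the block reduces to the final norm of its input.
print(conformer_block_sketch([1.0, 2.0], zero, zero, zero, zero, identity))
```

Real sublayers would be learned modules operating on tensors of shape `(batch, seq, dim)`. The placeholders here only make the data flow easy to trace.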

The full Conformer, which is just a stack of the `ConformerBlock` above:

```python
import torch
from conformer import Conformer

conformer = Conformer(
    dim = 512,
    depth = 12,          # 12 blocks
    dim_head = 64,
    heads = 8,
    ff_mult = 4,
    conv_expansion_factor = 2,
    conv_kernel_size = 31,
    attn_dropout = 0.,
    ff_dropout = 0.,
    conv_dropout = 0.
)

x = torch.randn(1, 1024, 512)

conformer(x) # (1, 1024, 512)
```

## Todo

- [ ] switch to a better relative positional encoding (RPE). Shaw's is dated
- [ ] flash attention with a better RPE

## Citations

```bibtex
@misc{gulati2020conformer,
    title   = {Conformer: Convolution-augmented Transformer for Speech Recognition},
    author  = {Anmol Gulati and James Qin and Chung-Cheng Chiu and Niki Parmar and Yu Zhang and Jiahui Yu and Wei Han and Shibo Wang and Zhengdong Zhang and Yonghui Wu and Ruoming Pang},
    year    = {2020},
    eprint  = {2005.08100},
    archivePrefix = {arXiv},
    primaryClass = {eess.AS}
}
```
