https://github.com/oelin/hourglass-vq-vae
An Hourglass Transformer VQ-VAE architecture.
- Host: GitHub
- URL: https://github.com/oelin/hourglass-vq-vae
- Owner: oelin
- License: MIT
- Created: 2024-02-15T14:46:19.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-15T16:04:58.000Z (over 1 year ago)
- Last Synced: 2024-02-16T15:56:37.007Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 46.9 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Hourglass VQ-VAE
An Hourglass Transformer VQ-VAE architecture.
## Goal
As part of the LatentLM project, a first-stage model capable of compressing very long sequences is necessary. We achieve this by combining the [Hourglass Transformer](https://arxiv.org/abs/2110.13711) with [FSQ](https://arxiv.org/abs/2309.15505) and [Contrastive Weight Tying](https://arxiv.org/abs/2309.08351) to construct an attention-only VQ-VAE architecture.
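For context, FSQ replaces the VQ-VAE's learned codebook lookup with per-dimension rounding onto a fixed grid, which sidesteps codebook collapse and auxiliary commitment losses. Below is a minimal PyTorch sketch of such a quantizer, not code from this repository: the `levels` tuple is illustrative, and it assumes odd level counts (the FSQ paper also supports even counts via a half-step offset).

```python
import torch
import torch.nn as nn

class FSQ(nn.Module):
    """Minimal Finite Scalar Quantization sketch (after Mentzer et al., 2023).

    Each latent dimension is squashed with tanh onto a bounded interval and
    rounded to one of `levels[i]` integer values. A straight-through
    estimator passes gradients through the rounding, so no codebook,
    commitment loss, or EMA updates are needed.
    """

    def __init__(self, levels=(7, 5, 5, 5)):  # illustrative; odd counts only
        super().__init__()
        self.register_buffer("levels", torch.tensor(levels, dtype=torch.float32))

    def forward(self, z):
        # z: (..., len(levels)). Bound each dim to [-half, half], then round.
        half = (self.levels - 1) / 2           # e.g. 5 levels -> {-2, ..., 2}
        bounded = torch.tanh(z) * half
        rounded = torch.round(bounded)
        # Straight-through: rounded values forward, tanh gradients backward.
        return bounded + (rounded - bounded).detach()
```

The discrete code for a token is then just the tuple of rounded integers, giving an implicit codebook of `prod(levels)` entries (7 · 5 · 5 · 5 = 875 in this sketch).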
## TODO
- [x] Linear attention (see the sketch after this list).
- [ ] GQA.
- [ ] FlashAttention2 with sliding window to replace linear attention.
- [ ] Attention upsampling to replace linear upsampling.
- [ ] (Optional) experiment with adversarial losses (Hourglass VQ-GAN).
- [ ] Hyperparameter tuning.
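Since linear attention is already implemented (first item above), here is a hedged sketch of the non-causal kernelized form from Katharopoulos et al. (2020), using the `elu(x) + 1` feature map; the repository's actual variant and tensor layout may differ.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Non-causal linear attention in O(n * d^2) time and O(d^2) memory.

    q, k, v: (batch, heads, seq, dim). The elu(x) + 1 feature map keeps
    similarities positive, replacing the softmax of standard attention.
    """
    q = F.elu(q) + 1
    k = F.elu(k) + 1
    kv = torch.einsum("bhnd,bhne->bhde", k, v)                        # global K^T V summary
    z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps)  # per-query normalizer
    return torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)

# Example: out = linear_attention(*[torch.randn(1, 8, 4096, 64)] * 3)
```

Because the `kv` summary is a single `dim × dim` matrix per head, cost grows linearly with sequence length rather than quadratically, which is what makes very long input sequences tractable before the hourglass bottleneck.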