https://github.com/hazdzz/paramixer
The PyTorch implementation of Paramixer.
- Host: GitHub
- URL: https://github.com/hazdzz/paramixer
- Owner: hazdzz
- License: mit
- Created: 2024-06-11T06:06:54.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-01T16:55:01.000Z (10 months ago)
- Last Synced: 2025-01-31T11:34:26.386Z (5 months ago)
- Topics: long-range-arena, pytorch, transformer
- Language: Python
- Homepage:
- Size: 19.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Paramixer
## About
The PyTorch implementation of Paramixer from the paper *Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention*.

## Citation
```
@inproceedings{9878955,
title = {Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention},
author = {Yu, Tong and Khalitov, Ruslan and Cheng, Lei and Yang, Zhirong},
booktitle = {2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022},
pages = {681--690}
}
```

## Datasets
1. LRA: https://mega.nz/file/tBdAyCwA#AvMIYJrkLset-Xb9ruA7fK04zZ_Jx2p7rdwrVVaTckE

## Training Steps
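As a hedged alternative to running the shell commands for steps 1 and 3 below by hand, the following Python sketch creates the data folder and extracts a downloaded archive into it. The helper name `prepare_dataset` is hypothetical and not part of this repository, and the sketch assumes the archive is a `.zip` file whose top-level entry is the dataset folder; downloading the archive (step 2) is left to you.

```python
import zipfile
from pathlib import Path


def prepare_dataset(archive_path, data_dir="data"):
    """Create data_dir if needed and extract a zip archive into it.

    Hypothetical helper for illustration; not part of the repository.
    Assumes the archive's top-level entry is the dataset folder, so that
    extracting into data_dir yields data_dir/<dataset>.
    """
    archive = Path(archive_path)
    target = Path(data_dir)

    # Step 1: equivalent of `mkdir data` (no error if it already exists).
    target.mkdir(parents=True, exist_ok=True)

    # Step 3: equivalent of `unzip` followed by `mv` into the data folder.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)

    # Report the top-level entries now present in the data folder.
    return sorted(p.name for p in target.iterdir())
```

For example, `prepare_dataset("path/to/archive.zip")` would populate `./data`, where the archive path is a placeholder for wherever you saved the download.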
1. Create a data folder:
```console
mkdir data
```

2. Download the compressed dataset archive:
```console
wget $URL
```

3. Decompress the archive and move its contents into the data folder:
```console
unzip $dataset.zip
mv $dataset ./data/$dataset
```

4. Run the main file:
```console
python ${dataset}_main.py --task="$task"
```

## Requirements
To install requirements:
```console
pip3 install -r requirements.txt