{"id":19236588,"url":"https://github.com/fudan-zvg/soft","last_synced_at":"2025-04-06T13:11:31.024Z","repository":{"id":37768741,"uuid":"405345766","full_name":"fudan-zvg/SOFT","owner":"fudan-zvg","description":"[NeurIPS 2021 Spotlight] \u0026 [IJCV 2024] SOFT: Softmax-free Transformer with Linear Complexity","archived":false,"fork":false,"pushed_at":"2024-03-16T01:28:20.000Z","size":5310,"stargazers_count":307,"open_issues_count":5,"forks_count":25,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-30T11:11:14.342Z","etag":null,"topics":["linear-complexity","linear-transformer","self-attention","softmax-free","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fudan-zvg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-11T10:10:33.000Z","updated_at":"2025-01-09T23:53:42.000Z","dependencies_parsed_at":"2022-06-29T14:33:47.761Z","dependency_job_id":"9edfe5ae-4498-4c13-9527-05c4ba75e581","html_url":"https://github.com/fudan-zvg/SOFT","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fudan-zvg%2FSOFT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fudan-zvg%2FSOFT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fudan-zvg%2FSOFT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fudan-zvg%2FSOFT/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fudan-zvg","download_url":"https://codeload.github.com/fudan-zvg/SOFT/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247485287,"owners_count":20946398,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["linear-complexity","linear-transformer","self-attention","softmax-free","transformer"],"created_at":"2024-11-09T16:21:35.216Z","updated_at":"2025-04-06T13:11:30.986Z","avatar_url":"https://github.com/fudan-zvg.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Softmax-free Linear Transformers\n\n![image](resources/structure.png)\n\n\u003e [**SOFT: Softmax-free Transformer with Linear Complexity**](https://arxiv.org/abs/2110.11945),            \n\u003e Jiachen Lu, Jinghan Yao, Junge Zhang, Xiatian Zhu, Hang Xu, Weiguo Gao, Chunjing Xu, Tao Xiang, Li Zhang  \n\u003e **NeurIPS 2021**\n\n\u003e [**Softmax-free Linear Transformers**](https://arxiv.org/abs/2207.03341),            \n\u003e Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang, Li Zhang  \n\u003e **IJCV 2024**\n\n\n## What's new\n1. We propose a normalized softmax-free self-attention with stronger generalizability.\n2. SOFT is now available for more vision tasks (object detection and semantic segmentation).\n\n## NEWS\n- [2024/02/12] Our journal extension [Softmax-free Linear Transformers](https://arxiv.org/abs/2207.03341) has been accepted by IJCV.\n- [2022/07/05] SOFT is now available for downstream tasks! An efficient normalization is applied to SOFT. 
Please refer to [SOFT-Norm](https://github.com/fudan-zvg/SOFT/tree/normalization).\n\n## Requirements\n* timm==0.3.2\n\n* torch\u003e=1.7.0 and torchvision that matches the PyTorch installation\n\n* cuda\u003e=10.2\n\nCompilation may fail on cuda \u003c 10.2.  \nWe have compiled it successfully on `cuda 10.2` and `cuda 11.2`. \n\n### Data preparation\n\nDownload and extract ImageNet train and val images from http://image-net.org/.\nThe directory structure is the standard layout for the torchvision [`datasets.ImageFolder`](https://pytorch.org/docs/stable/torchvision/datasets.html#imagefolder), and the training and validation data are expected to be in the `train/` and `val/` folders respectively:\n\n```\n/path/to/imagenet/\n  train/\n    class1/\n      img1.jpeg\n    class2/\n      img2.jpeg\n  val/\n    class1/\n      img3.jpeg\n    class2/\n      img4.jpeg\n```\n\n## Installation\n```shell script\ngit clone https://github.com/fudan-zvg/SOFT.git\npython -m pip install -e SOFT\n```\n\n## Main results\n### ImageNet-1K Image Classification\n\n| Model       | Resolution | Params | FLOPs | Top-1 % | Config |Pretrained Model|\n|-------------|:----------:|:------:|:-----:|:-------:|--------|--------|\n| SOFT-Tiny   | 224        | 13M    | 1.9G  | 79.3    |[SOFT_Tiny.yaml](config/SOFT_Tiny.yaml), [SOFT_Tiny_cuda.yaml](config/SOFT_Tiny_cuda.yaml)|[SOFT_Tiny](https://drive.google.com/file/d/1S04DCotIOkP0DaBb8WStQ513z82qT9de/view?usp=sharing), [SOFT_Tiny_cuda](https://drive.google.com/file/d/1inDKh3Wz_2KQgGH_2ywU5H_gLKZpIz_u/view?usp=sharing)|\n| SOFT-Small  | 224        | 24M    | 3.3G  | 82.2    |[SOFT_Small.yaml](config/SOFT_Small.yaml), [SOFT_Small_cuda.yaml](config/SOFT_Small_cuda.yaml)|\n| SOFT-Medium | 224        | 45M    | 7.2G  | 82.9    |[SOFT_Medium.yaml](config/SOFT_Medium.yaml), [SOFT_Medium_cuda.yaml](config/SOFT_Medium_cuda.yaml)|\n| SOFT-Large  | 224        | 64M    | 11.0G | 83.1    |[SOFT_Large.yaml](config/SOFT_Large.yaml), 
[SOFT_Large_cuda.yaml](config/SOFT_Large_cuda.yaml)|\n| SOFT-Huge   | 224        | 87M    | 16.3G | 83.3    |[SOFT_Huge.yaml](config/SOFT_Huge.yaml), [SOFT_Huge_cuda.yaml](config/SOFT_Huge_cuda.yaml)|\n| SOFT-Tiny-Norm   | 224        | 13M    | 1.9G  | 79.4    |[SOFT_Tiny_norm.yaml](config/SOFT_Tiny_norm.yaml)|[SOFT_Tiny_norm](https://drive.google.com/file/d/1Isy5b9v_4pyIXDqhKPNRq3WKH0etDlfl/view?usp=sharing)|\n| SOFT-Small-Norm  | 224        | 24M    | 3.3G  | 82.4    |[SOFT_Small_norm.yaml](config/SOFT_Small_norm.yaml)|[SOFT_Small_norm](https://drive.google.com/file/d/1OBjn7FzVdNP1Urqxq7X0yDykyPhxAAW1/view?usp=sharing)|\n| SOFT-Medium-Norm | 224        | 45M    | 7.2G  | 83.1    |[SOFT_Medium_norm.yaml](config/SOFT_Medium_norm.yaml)|[SOFT_Medium_norm](https://drive.google.com/file/d/1K2C6daaJn3jwurWh38uvV7rexirWjuzh/view?usp=sharing)|\n| SOFT-Large-Norm  | 224        | 64M    | 11.0G | 83.3    |[SOFT_Large_norm.yaml](config/SOFT_Large_norm.yaml)|[SOFT_Large_norm](https://drive.google.com/file/d/1aRYuF_gbBGyiXUDKEcpHJmM04SdvTUdP/view?usp=sharing)|\n| SOFT-Huge-Norm   | 224        | 87M    | 16.3G | 83.4    |[SOFT_Huge_norm.yaml](config/SOFT_Huge_norm.yaml)|\n\n### COCO Object Detection (2017 val)\n| Backbone     | Method | lr schd | box mAP | mask mAP | Params |\n|-------------|:----------:|:------:|:-----:|:-------:|:--------:|\n|SOFT-Tiny-Norm | RetinaNet | 1x | 40.0 | - | 23M|\n|SOFT-Tiny-Norm | Mask R-CNN | 1x | 41.2 | 38.2 | 33M|\n|SOFT-Small-Norm | RetinaNet | 1x | 42.8 | - | 34M|\n|SOFT-Small-Norm | Mask R-CNN | 1x | 43.8 | 40.1 | 44M|\n|SOFT-Medium-Norm | RetinaNet | 1x | 44.3 | - | 55M|\n|SOFT-Medium-Norm | Mask R-CNN | 1x | 46.6 | 42.0 | 65M|\n|SOFT-Large-Norm | RetinaNet | 1x | 45.3 | - | 74M|\n|SOFT-Large-Norm | Mask R-CNN | 1x | 47.0 | 42.2 | 84M|\n\n### ADE20K Semantic Segmentation (val)\n| Backbone     | Method | Crop size| lr schd | mIoU | Params |\n|-------------|:----------:|:----------:|:------:|:-----:|:-------:|\n|SOFT-Small-Norm | UperNet 
|512x512| 1x | 46.2 | 54M|\n|SOFT-Medium-Norm | UperNet |512x512 | 1x | 48.0 | 76M|\n\n## Get Started\n\n### Train\nWe provide two implementations of the Gaussian kernel: a `PyTorch` version and \nthe exact form of the Gaussian function implemented in `cuda`. Config files whose names contain `cuda` use the \ncuda implementation. Both implementations yield the same performance. \nPlease **install** SOFT before running the `cuda` version. \n```shell\n./dist_train.sh ${GPU_NUM} --data ${DATA_PATH} --config ${CONFIG_FILE}\n# For example, train SOFT-Tiny on the ImageNet training dataset with 8 GPUs\n./dist_train.sh 8 --data ${DATA_PATH} --config config/SOFT_Tiny.yaml\n```\n\n### Test\n\n```shell\n\n./dist_train.sh ${GPU_NUM} --data ${DATA_PATH} --config ${CONFIG_FILE} --eval_checkpoint ${CHECKPOINT_FILE} --eval\n\n# For example, test SOFT-Tiny on the ImageNet validation dataset with 8 GPUs\n\n./dist_train.sh 8 --data ${DATA_PATH} --config config/SOFT_Tiny.yaml --eval_checkpoint ${CHECKPOINT_FILE} --eval\n\n```\n## Reference\n\n```bibtex\n@inproceedings{SOFT,\n    title={SOFT: Softmax-free Transformer with Linear Complexity}, \n    author={Lu, Jiachen and Yao, Jinghan and Zhang, Junge and Zhu, Xiatian and Xu, Hang and Gao, Weiguo and Xu, Chunjing and Xiang, Tao and Zhang, Li},\n    booktitle={NeurIPS},\n    year={2021}\n}\n```\n\n```bibtex\n@article{Softmax,\n    title={Softmax-free Linear Transformers}, \n    author={Lu, Jiachen and Zhang, Li and Zhang, Junge and Zhu, Xiatian and Feng, Jianfeng and Xiang, Tao},\n    journal={International Journal of Computer Vision},\n    year={2024}\n}\n```\n\n## License\n\n[MIT](LICENSE)\n\n\n## Acknowledgement\n\nThanks to the following open-source repositories:  \n[Detectron2](https://github.com/facebookresearch/detectron2)  \n[T2T-ViT](https://github.com/yitu-opensource/T2T-ViT)  \n[PVT](https://github.com/whai362/PVT)   \n[Nystromformer](https://github.com/mlpen/Nystromformer)   
\n[pytorch-image-models](https://github.com/rwightman/pytorch-image-models)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffudan-zvg%2Fsoft","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffudan-zvg%2Fsoft","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffudan-zvg%2Fsoft/lists"}