{"id":17876538,"url":"https://github.com/paradoxzw/cosattention2d","last_synced_at":"2026-04-24T12:34:13.527Z","repository":{"id":125845670,"uuid":"469116320","full_name":"ParadoxZW/CosAttention2d","owner":"ParadoxZW","description":"a 2D cosine attention module inspired by cosFormer: Rethinking Softmax in Attention(https://arxiv.org/abs/2202.08791)","archived":false,"fork":false,"pushed_at":"2022-03-13T06:48:16.000Z","size":45,"stargazers_count":3,"open_issues_count":1,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-31T10:39:46.053Z","etag":null,"topics":["cosformer","iclr2020","pytorch","transformer","vit"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ParadoxZW.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-12T15:07:19.000Z","updated_at":"2023-08-16T08:46:45.000Z","dependencies_parsed_at":null,"dependency_job_id":"8ed3dd34-557b-44a7-aeb7-96d95fca8519","html_url":"https://github.com/ParadoxZW/CosAttention2d","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ParadoxZW/CosAttention2d","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParadoxZW%2FCosAttention2d","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParadoxZW%2FCosAttention2d/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParadoxZW%2FCosAttention2d/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/repositories/ParadoxZW%2FCosAttention2d/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ParadoxZW","download_url":"https://codeload.github.com/ParadoxZW/CosAttention2d/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParadoxZW%2FCosAttention2d/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32224225,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T10:26:35.452Z","status":"ssl_error","status_checked_at":"2026-04-24T10:25:27.643Z","response_time":64,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cosformer","iclr2020","pytorch","transformer","vit"],"created_at":"2024-10-28T11:32:00.683Z","updated_at":"2026-04-24T12:34:13.509Z","avatar_url":"https://github.com/ParadoxZW.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CosAttention2D\n\n## Introduction\n\nI designed a 2D cosine attention module inspired by [cosFormer: Rethinking Softmax in Attention](https://arxiv.org/abs/2202.08791).\nIt can be used to apply self-attention to grid features (for example, in the way self-attention is used in the encoder of DETR) with linear time complexity. 
I've tested the module on some tasks I'm familiar with and found that it improved accuracy while reducing both time and space complexity, compared with the standard self-attention module of the Transformer. \n\n## More Details \n\nAs in the original paper, the similarity function (i.e., attention) is defined as:\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./figures/eq1.png\"\u003e\n\u003c/p\u003e\nFor the tokens at specific positions in the query and key, we define:\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./figures/eq2.png\"\u003e\n\u003c/p\u003e\n\nAll notation is the same as in the original paper, except that `(i, j)` and `(k, l)` denote the 2D coordinates of the tokens in the query grid and key grid, respectively. Now we can change the multiplication order of `KQV` and perform the calculation with linear complexity. (P.S. I also tested a version that drops the two middle terms of the decomposition formula above; it computed faster but was less accurate.)\n\n## Usage\n\nYou can use the module defined in `cos_attn2d.py` to perform the calculation described above. Note that this module contains no learnable parameters.\n\nThere is also a simple use case in `cos_mhsa.py` that applies CosAttention2d in a multi-head setting.\n\nYou are free to use and modify these scripts. 
Any feedback or discussion is welcome.\n\n## Acknowledgment\n\nI appreciate [performer_pytorch](https://github.com/lucidrains/performer-pytorch) and [cosFormer](https://github.com/OpenNLPLab/cosFormer) for their valuable contributions.\n\n## Citation\n\n```\n@inproceedings{\n  zhen2022cosformer,\n  title={cosFormer: Rethinking Softmax In Attention},\n  author={Zhen Qin and Weixuan Sun and Hui Deng and Dongxu Li and Yunshen Wei and Baohong Lv and Junjie Yan and Lingpeng Kong and Yiran Zhong},\n  booktitle={International Conference on Learning Representations},\n  year={2022},\n  url={https://openreview.net/forum?id=Bl8CQrx2Up4}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparadoxzw%2Fcosattention2d","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fparadoxzw%2Fcosattention2d","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparadoxzw%2Fcosattention2d/lists"}