{"id":15601120,"url":"https://github.com/lucidrains/adjacent-attention-network","last_synced_at":"2025-04-30T11:15:24.149Z","repository":{"id":96484930,"uuid":"320389388","full_name":"lucidrains/adjacent-attention-network","owner":"lucidrains","description":"Graph neural network message passing reframed as a Transformer with local attention","archived":false,"fork":false,"pushed_at":"2022-12-24T16:51:48.000Z","size":33,"stargazers_count":68,"open_issues_count":0,"forks_count":11,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-19T01:31:46.080Z","etag":null,"topics":["artificial-intelligence","attention-mechanism","deep-learning","graph-neural-networks","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-12-10T21:02:46.000Z","updated_at":"2025-02-22T17:31:35.000Z","dependencies_parsed_at":"2023-03-13T16:30:46.457Z","dependency_job_id":null,"html_url":"https://github.com/lucidrains/adjacent-attention-network","commit_stats":{"total_commits":23,"total_committers":1,"mean_commits":23.0,"dds":0.0,"last_synced_commit":"de1f9832e1ce49056d3e60096acf83d7e59b051c"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fadjacent-attention-network","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fadjacent-attention-network/tags","releases_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fadjacent-attention-network/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fadjacent-attention-network/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/adjacent-attention-network/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251687622,"owners_count":21627601,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention-mechanism","deep-learning","graph-neural-networks","transformer"],"created_at":"2024-10-03T02:15:08.046Z","updated_at":"2025-04-30T11:15:24.117Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Adjacent Attention Network\n\nAn implementation of a simple transformer that is equivalent to a graph neural network in which message passing is done with multi-head attention at each successive layer. Since Graph Attention Network is already taken, I decided to name it Adjacent Attention Network instead. The design will be more transformer-centric. Instead of using the square root inverse adjacency matrix trick by Kipf and Welling, in this framework the adjacency matrix is simply translated into the proper attention mask at each layer.\n\nThis repository is for my own exploration into the graph neural network field. 
My gut tells me the transformer architecture can generalize and outperform graph neural networks.\n\n## Install\n\n```bash\n$ pip install adjacent-attention-network\n```\n\n## Usage\n\nBasically a transformer where each node attends to its neighbors as defined by the adjacency matrix. Complexity is O(n * max_neighbors), where max_neighbors is the largest number of neighbors of any node in the adjacency matrix.\n\nThe following example will have a complexity of roughly 1024 * 100, since each of the 1024 nodes is connected to about 10% of the others.\n\n```python\nimport torch\nfrom adjacent_attention_network import AdjacentAttentionNetwork\n\nmodel = AdjacentAttentionNetwork(\n    dim = 512,\n    depth = 6,\n    heads = 4\n)\n\nadj_mat = torch.empty(1, 1024, 1024).uniform_(0, 1) \u003c 0.1\nnodes   = torch.randn(1, 1024, 512)\nmask    = torch.ones(1, 1024).bool()\n\nmodel(nodes, adj_mat, mask = mask) # (1, 1024, 512)\n```\n\nIf the neighbor counts contain outliers (a few nodes with far more neighbors than the rest), then the above will lead to wasteful computation, since a lot of nodes will be doing attention on padding. You can use the following stop-gap measure to account for these outliers.\n\n```python\nimport torch\nfrom adjacent_attention_network import AdjacentAttentionNetwork\n\nmodel = AdjacentAttentionNetwork(\n    dim = 512,\n    depth = 6,\n    heads = 4,\n    num_neighbors_cutoff = 100\n).cuda()\n\nadj_mat = torch.empty(1, 1024, 1024).uniform_(0, 1).cuda() \u003c 0.1\nnodes   = torch.randn(1, 1024, 512).cuda()\nmask    = torch.ones(1, 1024).bool().cuda()\n\n# for some reason, one of the nodes is fully connected to all others\nadj_mat[:, 0] = True\n\nmodel(nodes, adj_mat, mask = mask) # (1, 1024, 512)\n```\n\nFor non-local attention, I've decided to use a trick from the Set Transformer paper, the \u003ca href=\"https://github.com/lucidrains/isab-pytorch\"\u003eInduced Set Attention Block (ISAB)\u003c/a\u003e. 
Through the lens of the graph neural network literature, this is analogous to having global nodes for non-local message passing.\n\n```python\nimport torch\nfrom adjacent_attention_network import AdjacentAttentionNetwork\n\nmodel = AdjacentAttentionNetwork(\n    dim = 512,\n    depth = 6,\n    heads = 4,\n    num_global_nodes = 5\n).cuda()\n\nadj_mat = torch.empty(1, 1024, 1024).uniform_(0, 1).cuda() \u003c 0.1\nnodes   = torch.randn(1, 1024, 512).cuda()\nmask    = torch.ones(1, 1024).bool().cuda()\n\nmodel(nodes, adj_mat, mask = mask) # (1, 1024, 512)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fadjacent-attention-network","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Fadjacent-attention-network","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fadjacent-attention-network/lists"}