{"id":13478419,"url":"https://github.com/openai/sparse_attention","last_synced_at":"2025-10-07T08:18:44.289Z","repository":{"id":43212965,"uuid":"181010062","full_name":"openai/sparse_attention","owner":"openai","description":"Examples of using sparse attention, as in \"Generating Long Sequences with Sparse Transformers\"","archived":false,"fork":false,"pushed_at":"2020-08-12T16:54:02.000Z","size":10,"stargazers_count":1570,"open_issues_count":12,"forks_count":191,"subscribers_count":41,"default_branch":"master","last_synced_at":"2025-05-23T11:25:44.243Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-04-12T13:06:26.000Z","updated_at":"2025-05-18T14:49:49.000Z","dependencies_parsed_at":"2022-09-10T03:02:48.353Z","dependency_job_id":null,"html_url":"https://github.com/openai/sparse_attention","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/openai/sparse_attention","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fsparse_attention","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fsparse_attention/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fsparse_attention/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fsparse_attention/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openai","download_url":"https://codeload.github.com/openai/sparse_attention/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fsparse_attention/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278740878,"owners_count":26037488,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T16:01:56.755Z","updated_at":"2025-10-07T08:18:44.245Z","avatar_url":"https://github.com/openai.png","language":"Python","readme":"**Status:** Archive (code is provided as-is, no updates expected)\n\n**Update August 2020:** For an example repository that achieves state-of-the-art modeling performance on CIFAR-10 using Sparse Transformers, please see https://github.com/openai/distribution_augmentation\n\n# Sparse Attention\n\nThis repository contains the sparse attention primitives used in Sparse Transformers (see [blog](https://openai.com/blog/sparse-transformer) and [paper](https://arxiv.org/abs/1904.10509)). Specifically, it includes the following:\n\n1) A faster implementation of normal attention (the upper triangle is not computed, and many operations are fused).\n2) An implementation of \"strided\" and \"fixed\" attention, as in the Sparse Transformers paper.\n3) A simple recompute decorator, which can be adapted for usage with attention.\n\nWe hope this code can further accelerate research into sparse attention.\n\nAn example Transformer implementation which is close to the version we use internally can be found at https://github.com/openai/blocksparse/blob/master/examples/transformer/enwik8.py. \n\n# Overview of kernels\nThe repository contains fused implementations of the attention operation, which takes in `Q`, `K`, `V` matrices (all of dimensionality `batch, time, dim`) representing the queries, keys, and values for a sequence. For every query element, a weighted sum of the values is returned, where the weightings are determined by the scaled matrix product of `Q` and `K^T`.\n\nThe kernels allow specification of block sparsity in the `QK^T` matrix. This means you define a pattern of 0/1s on a `[time/blocksize, time/blocksize]` matrix of blocks, and the values where it is 0 will not be computed, and not be included in the softmax calculation. Additionally, one can define \"callbacks\" on the computed blocks, which will further mask out values in any given block from the softmax (though the matrix product will still be computed for those elements). \n\nBlock sizes of `{8, 16, 32, 64}` are supported, and slight advantages in speed may be seen from using larger blocks.\n\n# Prerequisites\nFor fp32 and blocksize `32`, any NVIDIA GPU past Kepler can be used (i.e. compute capability beyond 3.5).\n\nFor fp16 and blocksize `8, 16, 32, 64`, a GPU with Tensor Cores (e.g. the V100 GPU, compute capability \u003e= 7.0) is required.\n\nThe primary dependency is the OpenAI [blocksparse](https://github.com/openai/blocksparse/) package.\n\nWith CUDA 10 and tensorflow-gpu, you can install blocksparse with `pip install blocksparse`.\n\nFor other setups, you must install blocksparse from source, and directions can be found in the [root of the repository](https://github.com/openai/blocksparse/).\n\n# Examples\n\nRun the following on a non-V100 GPU:\n```\npython attention.py\n```\n\nOn a V100 GPU:\n```\npython attention.py fp16\n```\n\n# General usage\nAn example can be found at the bottom of `attention.py`.\n\n```python\n\nfull_attn_tf = attention_impl(q, k, v, heads=4, attn_mode=\"all\", recompute=True)\nfull_attn_bs = blocksparse_attention_impl(q, k, v, heads=4, attn_mode=\"all\", recompute=True)\n\n# first step of strided attention\nlocal_attn_bs = blocksparse_attention_impl(q, k, v, heads=4, attn_mode=\"local\", local_attn_ctx=32, recompute=True)\nlocal_attn_tf = attention_impl(q, k, v, heads=4, attn_mode=\"local\", local_attn_ctx=32, recompute=True)\n\n# second step of strided attention\nstrided_attn_bs = blocksparse_attention_impl(q, k, v, heads=4, attn_mode=\"strided\", local_attn_ctx=32, recompute=True)\nstrided_attn_tf = attention_impl(q, k, v, heads=4, attn_mode=\"strided\", local_attn_ctx=32, recompute=True)\n\n# # the 'fixed' attention pattern\nfixed = blocksparse_attention_impl(q, k, v, heads=4, attn_mode=\"fixed\", local_attn_ctx=128, num_verts=4, vertsize=1, recompute=True)\n\n```\n\n# Referencing this work\n\nIf you find this helpful in your work, you can consider citing the following:\n\n```\n@article{child2019sparsetransformer,\n  title={Generating Long Sequences with Sparse Transformers},\n  author={Child, Rewon and Gray, Scott and Radford, Alec and Sutskever, Ilya},\n  journal={URL https://openai.com/blog/sparse-transformers},\n  year={2019}\n}\n```\n","funding_links":[],"categories":["Python","Transformer库与优化"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenai%2Fsparse_attention","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenai%2Fsparse_attention","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenai%2Fsparse_attention/lists"}