{"id":13443811,"url":"https://github.com/lucidrains/perceiver-pytorch","last_synced_at":"2025-05-15T10:07:29.919Z","repository":{"id":41065181,"uuid":"344676740","full_name":"lucidrains/perceiver-pytorch","owner":"lucidrains","description":"Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch","archived":false,"fork":false,"pushed_at":"2023-08-22T18:46:05.000Z","size":111,"stargazers_count":1136,"open_issues_count":31,"forks_count":136,"subscribers_count":30,"default_branch":"main","last_synced_at":"2025-04-14T16:57:59.688Z","etag":null,"topics":["artificial-intelligence","attention-mechanism","deep-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-03-05T02:58:52.000Z","updated_at":"2025-03-27T12:57:45.000Z","dependencies_parsed_at":"2023-02-01T11:45:33.719Z","dependency_job_id":"b5e895e1-dd2f-4448-917d-bb699555e74b","html_url":"https://github.com/lucidrains/perceiver-pytorch","commit_stats":{"total_commits":69,"total_committers":4,"mean_commits":17.25,"dds":0.05797101449275366,"last_synced_commit":"d6e3cda8abfbadfc24c3092bb9babfaa97dca8cd"},"previous_names":[],"tags_count":53,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fperceiver-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fperceiver-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fperceiver-pytorch/releases
","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fperceiver-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/perceiver-pytorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254319720,"owners_count":22051073,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention-mechanism","deep-learning"],"created_at":"2024-07-31T03:02:10.806Z","updated_at":"2025-05-15T10:07:24.904Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":["Python","时间序列"],"sub_categories":["网络服务_其他"],"readme":"\u003cimg src=\"./perceiver.png\" width=\"600px\"\u003e\u003c/img\u003e\n\n## Perceiver - Pytorch\n\nImplementation of \u003ca href=\"https://arxiv.org/abs/2103.03206\"\u003ePerceiver\u003c/a\u003e, General Perception with Iterative Attention, in Pytorch\n\n\u003ca href=\"https://www.youtube.com/watch?v=P_xeshTnPZg\"\u003eYannic Kilcher explanation!\u003c/a\u003e\n\n## Install\n\n```bash\n$ pip install perceiver-pytorch\n```\n\n## Usage\n\n```python\nimport torch\nfrom perceiver_pytorch import Perceiver\n\nmodel = Perceiver(\n    input_channels = 3,          # number of channels for each token of the input\n    input_axis = 2,              # number of axis for input data (2 for images, 3 for video)\n    num_freq_bands = 6,          # number of freq bands, with original value (2 * K + 1)\n    max_freq = 10.,              
# maximum frequency, hyperparameter depending on how fine the data is\n    depth = 6,                   # depth of net. The shape of the final attention mechanism will be:\n                                 #   depth * (cross attention -\u003e self_per_cross_attn * self attention)\n    num_latents = 256,           # number of latents, or induced set points, or centroids. different papers giving it different names\n    latent_dim = 512,            # latent dimension\n    cross_heads = 1,             # number of heads for cross attention. paper said 1\n    latent_heads = 8,            # number of heads for latent self attention, 8\n    cross_dim_head = 64,         # number of dimensions per cross attention head\n    latent_dim_head = 64,        # number of dimensions per latent self attention head\n    num_classes = 1000,          # output number of classes\n    attn_dropout = 0.,\n    ff_dropout = 0.,\n    weight_tie_layers = False,   # whether to weight tie layers (optional, as indicated in the diagram)\n    fourier_encode_data = True,  # whether to auto-fourier encode the data, using the input_axis given. 
defaults to True, but can be turned off if you are fourier encoding the data yourself\n    self_per_cross_attn = 2      # number of self attention blocks per cross attention\n)\n\nimg = torch.randn(1, 224, 224, 3) # 1 imagenet image, pixelized\n\nmodel(img) # (1, 1000)\n```\n\nFor the backbone of \u003ca href=\"https://arxiv.org/abs/2107.14795\"\u003ePerceiver IO\u003c/a\u003e, the follow-up paper that allows for a flexible output sequence length, just import `PerceiverIO` instead\n\n```python\nimport torch\nfrom perceiver_pytorch import PerceiverIO\n\nmodel = PerceiverIO(\n    dim = 32,                    # dimension of sequence to be encoded\n    queries_dim = 32,            # dimension of decoder queries\n    logits_dim = 100,            # dimension of final logits\n    depth = 6,                   # depth of net\n    num_latents = 256,           # number of latents, or induced set points, or centroids. different papers giving it different names\n    latent_dim = 512,            # latent dimension\n    cross_heads = 1,             # number of heads for cross attention. 
paper said 1\n    latent_heads = 8,            # number of heads for latent self attention, 8\n    cross_dim_head = 64,         # number of dimensions per cross attention head\n    latent_dim_head = 64,        # number of dimensions per latent self attention head\n    weight_tie_layers = False,   # whether to weight tie layers (optional, as indicated in the diagram)\n    seq_dropout_prob = 0.2       # fraction of the tokens from the input sequence to drop (structured dropout, for saving compute and regularizing effects)\n)\n\nseq = torch.randn(1, 512, 32)\nqueries = torch.randn(128, 32)\n\nlogits = model(seq, queries = queries) # (1, 128, 100) - (batch, decoder seq, logits dim)\n```\n\nAs an example, here is PerceiverIO used as a language model\n\n```python\nimport torch\nfrom perceiver_pytorch import PerceiverLM\n\nmodel = PerceiverLM(\n    num_tokens = 20000,          # number of tokens\n    dim = 32,                    # dimension of sequence to be encoded\n    depth = 6,                   # depth of net\n    max_seq_len = 2048,          # maximum sequence length\n    num_latents = 256,           # number of latents, or induced set points, or centroids. different papers giving it different names\n    latent_dim = 512,            # latent dimension\n    cross_heads = 1,             # number of heads for cross attention. 
paper said 1\n    latent_heads = 8,            # number of heads for latent self attention, 8\n    cross_dim_head = 64,         # number of dimensions per cross attention head\n    latent_dim_head = 64,        # number of dimensions per latent self attention head\n    weight_tie_layers = False    # whether to weight tie layers (optional, as indicated in the diagram)\n)\n\nseq = torch.randint(0, 20000, (1, 512))\nmask = torch.ones(1, 512).bool()\n\nlogits = model(seq, mask = mask) # (1, 512, 20000)\n```\n\n## Experimental\n\nI have also included a version of Perceiver that includes bottom-up (in addition to top-down) attention, using the same scheme as presented in the original \u003ca href=\"https://arxiv.org/abs/1810.00825\"\u003eSet Transformers\u003c/a\u003e paper as the \u003ca href=\"https://github.com/lucidrains/isab-pytorch\"\u003eInduced Set Attention Block\u003c/a\u003e.\n\nYou simply have to change the above import to\n\n```python\nfrom perceiver_pytorch.experimental import Perceiver\n```\n\n## Citations\n\n```bibtex\n@misc{jaegle2021perceiver,\n    title   = {Perceiver: General Perception with Iterative Attention},\n    author  = {Andrew Jaegle and Felix Gimeno and Andrew Brock and Andrew Zisserman and Oriol Vinyals and Joao Carreira},\n    year    = {2021},\n    eprint  = {2103.03206},\n    archivePrefix = {arXiv},\n    primaryClass = {cs.CV}\n}\n```\n\n```bibtex\n@misc{jaegle2021perceiverio,\n    title   = {Perceiver IO: A General Architecture for Structured Inputs \u0026 Outputs},\n    author  = {Andrew Jaegle and Sebastian Borgeaud and Jean-Baptiste Alayrac and Carl Doersch and Catalin Ionescu and David Ding and Skanda Koppula and Andrew Brock and Evan Shelhamer and Olivier Hénaff and Matthew M. 
Botvinick and Andrew Zisserman and Oriol Vinyals and João Carreira},\n    year    = {2021},\n    eprint  = {2107.14795},\n    archivePrefix = {arXiv},\n    primaryClass = {cs.LG}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fperceiver-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Fperceiver-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fperceiver-pytorch/lists"}