{"id":15601028,"url":"https://github.com/lucidrains/rq-transformer","last_synced_at":"2025-08-19T14:14:14.626Z","repository":{"id":49716014,"uuid":"468843293","full_name":"lucidrains/RQ-Transformer","owner":"lucidrains","description":"Implementation of RQ Transformer, proposed in the paper \"Autoregressive Image Generation using Residual Quantization\"","archived":false,"fork":false,"pushed_at":"2022-04-19T22:18:02.000Z","size":35861,"stargazers_count":112,"open_issues_count":1,"forks_count":7,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-07-31T02:16:15.642Z","etag":null,"topics":["artificial-intelligence","attention-mechanism","deep-learning","image-generation","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-03-11T17:31:52.000Z","updated_at":"2025-07-08T07:22:15.000Z","dependencies_parsed_at":"2022-09-06T17:50:45.296Z","dependency_job_id":null,"html_url":"https://github.com/lucidrains/RQ-Transformer","commit_stats":null,"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/lucidrains/RQ-Transformer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2FRQ-Transformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2FRQ-Transformer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2FRQ-Transformer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2FRQ-Transformer/
manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/RQ-Transformer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2FRQ-Transformer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271166354,"owners_count":24710465,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-19T02:00:09.176Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention-mechanism","deep-learning","image-generation","transformers"],"created_at":"2024-10-03T02:12:31.525Z","updated_at":"2025-08-19T14:14:14.590Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"./rq-transformer.png\" width=\"500px\"\u003e\u003c/img\u003e\n\n## RQ-Transformer\n\nImplementation of \u003ca href=\"https://arxiv.org/abs/2203.01941\"\u003eRQ Transformer\u003c/a\u003e, which proposes a more efficient way of training multi-dimensional sequences autoregressively. This repository will only contain the transformer for now. 
You can use \u003ca href=\"https://github.com/lucidrains/vector-quantize-pytorch#residual-vq\"\u003ethis vector quantization library\u003c/a\u003e for the residual VQ.\n\nThis type of axial autoregressive transformer should be compatible with \u003ca href=\"https://github.com/lucidrains/nwt-pytorch\"\u003ememcodes\u003c/a\u003e, proposed in \u003ca href=\"https://arxiv.org/abs/2106.04283\"\u003eNWT\u003c/a\u003e. It would likely also work well with \u003ca href=\"https://github.com/lucidrains/vector-quantize-pytorch#multi-headed-vq\"\u003emulti-headed VQ\u003c/a\u003e\n\n## Install\n\n```bash\n$ pip install RQ-transformer\n```\n\n## Usage\n\n```python\nimport torch\nfrom rq_transformer import RQTransformer\n\nmodel = RQTransformer(\n    num_tokens = 16000,             # number of tokens, in the paper they had a codebook size of 16k\n    dim = 512,                      # transformer model dimension\n    max_spatial_seq_len = 1024,     # maximum positions along space\n    depth_seq_len = 4,              # number of positions along depth (residual quantizations in paper)\n    spatial_layers = 8,             # number of layers for space\n    depth_layers = 4,               # number of layers for depth\n    dim_head = 64,                  # dimension per head\n    heads = 8,                      # number of attention heads\n)\n\nx = torch.randint(0, 16000, (1, 1024, 4))\n\nloss = model(x, return_loss = True)\nloss.backward()\n\n# then after much training\n\nlogits = model(x)\n\n# and sample from the logits accordingly\n# or you can use the generate function\n\nsampled = model.generate(temperature = 0.9, filter_thres = 0.9) # (1, 1024, 4)\n```\n\nI also think there is something deeper going on, and have generalized this to any number of dimensions. 
You can use it by importing the `HierarchicalCausalTransformer`\n\n```python\nimport torch\nfrom rq_transformer import HierarchicalCausalTransformer\n\nmodel = HierarchicalCausalTransformer(\n    num_tokens = 16000,                   # number of tokens\n    dim = 512,                            # feature dimension\n    dim_head = 64,                        # dimension of attention heads\n    heads = 8,                            # number of attention heads\n    depth = (4, 4, 2),                    # 3 stages (but can be any number) - transformer of depths 4, 4, 2\n    max_seq_len = (16, 4, 5)              # the maximum sequence length of the first stage, then the fixed sequence lengths of all subsequent stages\n).cuda()\n\nx = torch.randint(0, 16000, (1, 10, 4, 5)).cuda()\n\nloss = model(x, return_loss = True)\nloss.backward()\n\n# after a lot of training\n\nsampled = model.generate(temperature = 0.9, filter_thres = 0.9) # (1, 16, 4, 5)\n```\n\n## Todo\n\n- [ ] move hierarchical causal transformer to separate repository, seems to be working\n\n## Citations\n\n```bibtex\n@unknown{unknown,\n    author  = {Lee, Doyup and Kim, Chiheon and Kim, Saehoon and Cho, Minsu and Han, Wook-Shin},\n    year    = {2022},\n    month   = {03},\n    title   = {Autoregressive Image Generation using Residual Quantization}\n}\n```\n\n```bibtex\n@misc{press2021ALiBi,\n    title   = {Train Short, Test Long: Attention with Linear Biases Enable Input Length Extrapolation},\n    author  = {Ofir Press and Noah A. Smith and Mike Lewis},\n    year    = {2021},\n    url     = {https://ofir.io/train_short_test_long.pdf}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Frq-transformer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Frq-transformer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Frq-transformer/lists"}