{"id":15601083,"url":"https://github.com/lucidrains/recurrent-interface-network-pytorch","last_synced_at":"2025-08-27T02:19:03.447Z","repository":{"id":65588206,"uuid":"581371827","full_name":"lucidrains/recurrent-interface-network-pytorch","owner":"lucidrains","description":"Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch","archived":false,"fork":false,"pushed_at":"2024-02-14T15:05:56.000Z","size":749,"stargazers_count":204,"open_issues_count":6,"forks_count":15,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-06-21T05:39:54.197Z","etag":null,"topics":["artificial-intelligence","attention-mechanisms","deep-learning","denoising-diffusion","image-generation","latents","video-generation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2022-12-23T02:01:32.000Z","updated_at":"2025-06-03T07:45:56.000Z","dependencies_parsed_at":"2024-02-14T16:26:51.670Z","dependency_job_id":"67d7a532-4bad-477c-a55b-96ffb83211cb","html_url":"https://github.com/lucidrains/recurrent-interface-network-pytorch","commit_stats":null,"previous_names":[],"tags_count":37,"template":false,"template_full_name":null,"purl":"pkg:github/lucidrains/recurrent-interface-network-pytorch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Frecurrent-interface-network-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repos
itories/lucidrains%2Frecurrent-interface-network-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Frecurrent-interface-network-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Frecurrent-interface-network-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/recurrent-interface-network-pytorch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Frecurrent-interface-network-pytorch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272280330,"owners_count":24906114,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-27T02:00:09.397Z","response_time":76,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention-mechanisms","deep-learning","denoising-diffusion","image-generation","latents","video-generation"],"created_at":"2024-10-03T02:14:11.916Z","updated_at":"2025-08-27T02:19:03.425Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"./images/rin.png\" width=\"500png\"\u003e\u003c/img\u003e\n\n\u003cimg src=\"./images/latent-self-conditioning.png\" 
width=\"600px\"\u003e\u003c/img\u003e\n\n## Recurrent Interface Network (RIN) - Pytorch\n\nImplementation of \u003ca href=\"https://arxiv.org/abs/2212.11972\"\u003eRecurrent Interface Network (RIN)\u003c/a\u003e, for highly efficient generation of images and video without cascading networks, in Pytorch. The authors unknowingly reinvented the \u003ca href=\"https://github.com/lucidrains/isab-pytorch\"\u003einduced set-attention block\u003c/a\u003e from the \u003ca href=\"https://arxiv.org/abs/1810.00825\"\u003eset transformers\u003c/a\u003e paper. They also combine this with the self-conditioning technique from the \u003ca href=\"https://arxiv.org/abs/2208.04202\"\u003eBit Diffusion paper\u003c/a\u003e, applied specifically to the latents. The last ingredient seems to be a new noise schedule based on the sigmoid, which the authors claim is better than the cosine schedule for larger images.\n\nThe big surprise is that the generations can reach this level of fidelity. I will need to verify this on my own machine.\n\nAdditionally, we will try adding an extra linear attention on the main branch, as well as self-conditioning in pixel space.\n\nThe two main findings are the insight that one can self-condition on any hidden state of the network, and the newly proposed sigmoid noise schedule.\n\nThis repository also contains the ability to \u003ca href=\"https://arxiv.org/abs/2301.10972\"\u003enoise higher resolution images more\u003c/a\u003e, using the `scale` keyword argument on the `GaussianDiffusion` class. 
It also contains the simple linear gamma schedule proposed in that paper.\n\n## Appreciation\n\n- \u003ca href=\"https://stability.ai/\"\u003eStability.ai\u003c/a\u003e for the generous sponsorship to work on cutting edge artificial intelligence research\n\n## Install\n\n```bash\n$ pip install rin-pytorch\n```\n\n## Usage\n\n```python\nfrom rin_pytorch import GaussianDiffusion, RIN, Trainer\n\nmodel = RIN(\n    dim = 256,                  # model dimensions\n    image_size = 128,           # image size\n    patch_size = 8,             # patch size\n    depth = 6,                  # depth\n    num_latents = 128,          # number of latents. they used 256 in the paper\n    dim_latent = 512,           # can be greater than the image dimension (dim) for greater capacity\n    latent_self_attn_depth = 4, # number of latent self attention blocks per recurrent step, K in the paper\n).cuda()\n\ndiffusion = GaussianDiffusion(\n    model,\n    timesteps = 400,\n    train_prob_self_cond = 0.9,  # how often to self condition on latents\n    scale = 1.                   # this will be set to \u003c 1. 
for more noising and leads to better convergence when training on higher resolution images (512, 1024) - input noised images will be auto variance normalized\n).cuda()\n\ntrainer = Trainer(\n    diffusion,\n    '/path/to/your/images',\n    num_samples = 16,\n    train_batch_size = 4,\n    gradient_accumulate_every = 4,\n    train_lr = 1e-4,\n    save_and_sample_every = 1000,\n    train_num_steps = 700000,         # total training steps\n    ema_decay = 0.995,                # exponential moving average decay\n)\n\ntrainer.train()\n```\n\nResults will be saved periodically to the `./results` folder\n\nIf you would like to experiment with the `RIN` and `GaussianDiffusion` class outside the `Trainer`\n\n```python\nimport torch\nfrom rin_pytorch import RIN, GaussianDiffusion\n\nmodel = RIN(\n    dim = 256,                  # model dimensions\n    image_size = 128,           # image size\n    patch_size = 8,             # patch size\n    depth = 6,                  # depth\n    num_latents = 128,          # number of latents. 
they used 256 in the paper\n    latent_self_attn_depth = 4, # number of latent self attention blocks per recurrent step, K in the paper\n).cuda()\n\ndiffusion = GaussianDiffusion(\n    model,\n    timesteps = 1000,\n    train_prob_self_cond = 0.9,\n    scale = 1.\n)\n\ntraining_images = torch.rand(8, 3, 128, 128).cuda() # images are normalized from 0 to 1\nloss = diffusion(training_images)\nloss.backward()\n# after a lot of training\n\nsampled_images = diffusion.sample(batch_size = 4)\nsampled_images.shape # (4, 3, 128, 128)\n```\n\n## Todo\n\n- [ ] experiment with \u003ca href=\"https://github.com/lucidrains/bidirectional-cross-attention/issues\"\u003ebidirectional cross attention\u003c/a\u003e\n- [ ] add ability to use 2d sinusoidal pos emb, from simple vit paper\n\n## Citations\n\n```bibtex\n@misc{jabri2022scalable,\n    title   = {Scalable Adaptive Computation for Iterative Generation}, \n    author  = {Allan Jabri and David Fleet and Ting Chen},\n    year    = {2022},\n    eprint  = {2212.11972},\n    archivePrefix = {arXiv},\n    primaryClass = {cs.LG}\n}\n```\n\n```bibtex\n@inproceedings{Chen2023OnTI,\n    title   = {On the Importance of Noise Scheduling for Diffusion Models},\n    author  = {Ting Chen},\n    year    = {2023}\n}\n```\n\n```bibtex\n@article{Salimans2022ProgressiveDF,\n    title   = {Progressive Distillation for Fast Sampling of Diffusion Models},\n    author  = {Tim Salimans and Jonathan Ho},\n    journal = {ArXiv},\n    year    = {2022},\n    volume  = {abs/2202.00512}\n}\n```\n\n```bibtex\n@misc{https://doi.org/10.48550/arxiv.2302.01327,\n    doi     = {10.48550/ARXIV.2302.01327},\n    url     = {https://arxiv.org/abs/2302.01327},\n    author  = {Kumar, Manoj and Dehghani, Mostafa and Houlsby, Neil},\n    title   = {Dual PatchNorm},\n    publisher = {arXiv},\n    year    = {2023},\n    copyright = {Creative Commons Attribution 4.0 International}\n}\n```\n\n```bibtex\n@inproceedings{Hang2023EfficientDT,\n    title   = {Efficient Diffusion 
Training via Min-SNR Weighting Strategy},\n    author  = {Tiankai Hang and Shuyang Gu and Chen Li and Jianmin Bao and Dong Chen and Han Hu and Xin Geng and Baining Guo},\n    year    = {2023}\n}\n```\n\n```bibtex\n@inproceedings{dao2022flashattention,\n    title   = {Flash{A}ttention: Fast and Memory-Efficient Exact Attention with {IO}-Awareness},\n    author  = {Dao, Tri and Fu, Daniel Y. and Ermon, Stefano and Rudra, Atri and R{\\'e}, Christopher},\n    booktitle = {Advances in Neural Information Processing Systems},\n    year    = {2022}\n}\n```\n\n```bibtex\n@inproceedings{Hoogeboom2023simpleDE,\n    title   = {simple diffusion: End-to-end diffusion for high resolution images},\n    author  = {Emiel Hoogeboom and Jonathan Heek and Tim Salimans},\n    year    = {2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Frecurrent-interface-network-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Frecurrent-interface-network-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Frecurrent-interface-network-pytorch/lists"}