{"id":15601016,"url":"https://github.com/lucidrains/glom-pytorch","last_synced_at":"2025-04-07T15:10:05.461Z","repository":{"id":54193855,"uuid":"343860770","full_name":"lucidrains/glom-pytorch","owner":"lucidrains","description":"An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up processing, and attention (consensus between columns), for emergent part-whole heirarchies from data","archived":false,"fork":false,"pushed_at":"2021-03-27T16:49:35.000Z","size":104,"stargazers_count":193,"open_issues_count":6,"forks_count":27,"subscribers_count":15,"default_branch":"main","last_synced_at":"2025-03-31T14:11:14.698Z","etag":null,"topics":["artificial-intelligence","deep-learning","geoffrey-hinton"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-02T17:42:40.000Z","updated_at":"2024-11-22T18:56:13.000Z","dependencies_parsed_at":"2022-08-13T08:50:49.974Z","dependency_job_id":null,"html_url":"https://github.com/lucidrains/glom-pytorch","commit_stats":null,"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fglom-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fglom-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fglom-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains
%2Fglom-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/glom-pytorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247675607,"owners_count":20977378,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-learning","geoffrey-hinton"],"created_at":"2024-10-03T02:11:48.710Z","updated_at":"2025-04-07T15:10:05.416Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"./glom2.png\" width=\"400px\"\u003e\u003c/img\u003e\n\n\u003cimg src=\"./glom1.png\" width=\"600px\"\u003e\u003c/img\u003e\n\n## GLOM - Pytorch\n\nAn implementation of \u003ca href=\"https://arxiv.org/abs/2102.12627\"\u003eGlom\u003c/a\u003e, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up processing, and attention (consensus between columns) for learning emergent part-whole hierarchies from data.\n\n\u003ca href=\"https://www.youtube.com/watch?v=cllFzkvrYmE\"\u003eYannic Kilcher's video\u003c/a\u003e was instrumental in helping me to understand this paper\n\n## Install\n\n```bash\n$ pip install glom-pytorch\n```\n\n## Usage\n\n```python\nimport torch\nfrom glom_pytorch import Glom\n\nmodel = Glom(\n    dim = 512,         # dimension\n    levels = 6,        # number of levels\n    image_size = 224,  # image size\n    patch_size = 14    # patch size\n)\n\nimg = torch.randn(1, 3, 224, 
224)\nlevels = model(img, iters = 12) # (1, 256, 6, 512) - (batch, patches, levels, dimension)\n```\n\nPass the `return_all = True` keyword argument on forward to get back all the column and level states per iteration (including the initial state, so number of iterations + 1 states in total). You can then use this to attach any losses to any level outputs at any time step.\n\nIt also gives you access to all the level data across iterations for clustering, which you can inspect for the islands theorized in the paper.\n\n```python\nimport torch\nfrom glom_pytorch import Glom\n\nmodel = Glom(\n    dim = 512,         # dimension\n    levels = 6,        # number of levels\n    image_size = 224,  # image size\n    patch_size = 14    # patch size\n)\n\nimg = torch.randn(1, 3, 224, 224)\nall_levels = model(img, iters = 12, return_all = True) # (13, 1, 256, 6, 512) - (time, batch, patches, levels, dimension)\n\n# get the top level outputs after iteration 6\ntop_level_output = all_levels[7, :, :, -1] # (1, 256, 512) - (batch, patches, dimension)\n```\n\nDenoising self-supervised learning for encouraging emergence, as described by Hinton\n\n```python\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn\nfrom einops.layers.torch import Rearrange\n\nfrom glom_pytorch import Glom\n\nmodel = Glom(\n    dim = 512,         # dimension\n    levels = 6,        # number of levels\n    image_size = 224,  # image size\n    patch_size = 14    # patch size\n)\n\nimg = torch.randn(1, 3, 224, 224)\nnoised_img = img + torch.randn_like(img)\n\nall_levels = model(noised_img, return_all = True)\n\npatches_to_images = nn.Sequential(\n    nn.Linear(512, 14 * 14 * 3),\n    Rearrange('b (h w) (p1 p2 c) -\u003e b c (h p1) (w p2)', p1 = 14, p2 = 14, h = (224 // 14))\n)\n\ntop_level = all_levels[7, :, :, -1]  # get the top level embeddings after iteration 6\nrecon_img = patches_to_images(top_level)\n\n# do self-supervised learning by denoising\n\nloss = F.mse_loss(img, 
recon_img)\nloss.backward()\n```\n\nYou can pass the state of the columns and levels back into the model to continue where you left off (for example, when processing consecutive frames of a slow video, as mentioned in the paper)\n\n```python\nimport torch\nfrom glom_pytorch import Glom\n\nmodel = Glom(\n    dim = 512,\n    levels = 6,\n    image_size = 224,\n    patch_size = 14\n)\n\nimg1 = torch.randn(1, 3, 224, 224)\nimg2 = torch.randn(1, 3, 224, 224)\nimg3 = torch.randn(1, 3, 224, 224)\n\nlevels1 = model(img1, iters = 12)                   # image 1 for 12 iterations\nlevels2 = model(img2, levels = levels1, iters = 10) # image 2 for 10 iterations\nlevels3 = model(img3, levels = levels2, iters = 6)  # image 3 for 6 iterations\n```\n\n### Appreciation\n\nThanks go out to \u003ca href=\"https://github.com/cfoster0\"\u003eCfoster0\u003c/a\u003e for reviewing the code\n\n### Todo\n\n- [ ] contrastive / consistency regularization of top-ish levels\n\n## Citations\n\n```bibtex\n@misc{hinton2021represent,\n    title   = {How to represent part-whole hierarchies in a neural network},\n    author  = {Geoffrey Hinton},\n    year    = {2021},\n    eprint  = {2102.12627},\n    archivePrefix = {arXiv},\n    primaryClass = {cs.CV}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fglom-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Fglom-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fglom-pytorch/lists"}