{"id":19932038,"url":"https://github.com/amazon-science/adaslot","last_synced_at":"2025-05-03T11:31:03.270Z","repository":{"id":254553361,"uuid":"845882750","full_name":"amazon-science/AdaSlot","owner":"amazon-science","description":"Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number]","archived":false,"fork":false,"pushed_at":"2024-11-01T15:35:40.000Z","size":979,"stargazers_count":22,"open_issues_count":6,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-01T16:26:49.975Z","etag":null,"topics":["object-centric","object-centric-learning"],"latest_commit_sha":null,"homepage":"https://kfan21.github.io/AdaSlot/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amazon-science.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-22T05:37:24.000Z","updated_at":"2024-11-01T15:35:44.000Z","dependencies_parsed_at":"2024-08-24T09:56:37.178Z","dependency_job_id":null,"html_url":"https://github.com/amazon-science/AdaSlot","commit_stats":null,"previous_names":["amazon-science/adaslot"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2FAdaSlot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2FAdaSlot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2FAdaSlot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2FAdaSlot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amazon-science","download_url":"https://codeload.github.com/amazon-science/AdaSlot/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224360233,"owners_count":17298319,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["object-centric","object-centric-learning"],"created_at":"2024-11-12T23:08:48.766Z","updated_at":"2024-11-12T23:08:49.365Z","avatar_url":"https://github.com/amazon-science.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"#  Official PyTorch Implementation of Adaptive Slot Attention: Object Discovery with Dynamic Slot Number\n[![ArXiv](https://img.shields.io/badge/ArXiv-2406.09196-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2406.09196)[![HomePage](https://img.shields.io/badge/HomePage-Visit-blue.svg?logo=homeadvisor\u0026logoColor=f5f5f5)](https://kfan21.github.io/AdaSlot/)![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)\n\u003e [**Adaptive Slot Attention: Object Discovery with Dynamic Slot Number**](https://arxiv.org/abs/2406.09196)\u003cbr\u003e\n\u003e  [Ke Fan](https://kfan21.github.io/), [Zechen Bai](https://www.baizechen.site/), [Tianjun Xiao](http://tianjunxiao.com/), [Tong He](https://hetong007.github.io/), [Max Horn](https://expectationmax.github.io/), [Yanwei Fu†](http://yanweifu.github.io/), [Francesco Locatello](https://www.francescolocatello.com/), [Zheng Zhang](https://scholar.google.com/citations?hl=zh-CN\u0026user=k0KiE4wAAAAJ)\n\n\nThis is the official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number]([CVPR 2024 Open Access Repository (thecvf.com)](https://openaccess.thecvf.com/content/CVPR2024/html/Fan_Adaptive_Slot_Attention_Object_Discovery_with_Dynamic_Slot_Number_CVPR_2024_paper.html)).\n\n## Introduction\n\n![framework](framework.png)\n\nObject-centric learning (OCL) uses slots to extract object representations, enhancing flexibility and interpretability. Slot attention, a common OCL method, refines slot representations with attention mechanisms but requires predefined slot numbers, ignoring object variability. To address this, a novel complexity-aware object auto-encoder framework introduces adaptive slot attention (AdaSlot), dynamically determining the optimal slot count based on data content through a discrete slot sampling module. A masked slot decoder suppresses unselected slots during decoding. Extensive testing shows this framework matches or exceeds fixed-slot models, adapting slot numbers based on instance complexity and promising further research opportunities.\n\n## News!\n- [2024.11.02] We released the pre-trained checkpoints! Please find them at this [link](https://drive.google.com/drive/folders/1SRKE9Q5XF2UeYj1XB8kyjxORDmB7c7Mz)!\n- [2024.08.24] We open-sourced the code!\n\n## Development Setup\n\nInstalling AdaSlot requires at least python3.8. Installation can be done using [poetry](https://python-poetry.org/docs/#installation).  After installing `poetry`, check out the repo and setup a development environment:\n\n```bash\n# install python3.8\nsudo apt update\nsudo apt install software-properties-common\nsudo add-apt-repository ppa:deadsnakes/ppa\nsudo apt install python3.8\n\n# install poetry with python3.8\ncurl -sSL https://install.python-poetry.org | python3.8 - --version 1.2.0\n## add poetry to environment variable\n\n# create virtual environment with poetry\ncd $code_path\npoetry install -E timm\n```\n\nThis installs the `ocl` package and the cli scripts used for running experiments in a poetry managed virtual environment. Activate the poetry virtual environment `poetry shell` before running the experiments.\n\n## Running experiments\n\nExperiments are defined in the folder `configs/experiment` and can be run\nby setting the experiment variable. For example, if we run OC-MOT on Cater dataset, we can follow: \n\n```bash\npoetry shell\n\npython -m ocl.cli.train +experiment=projects/bridging/dinosaur/movi_e_feat_rec_vitb16.yaml\npython -m ocl.cli.train +experiment=projects/bridging/dinosaur/movi_e_feat_rec_vitb16_adaslot.yaml +load_model_weight=PATH-TO-KMAX-SLOT-CHECKPOINT\npython -m ocl.cli.eval +experiment=projects/bridging/dinosaur/movi_e_feat_rec_vitb16_adaslot_eval.yaml ++load_checkpoint=PATH-TO-ADASLOT-CHECKPOINT\n\npython -m ocl.cli.train +experiment=projects/bridging/dinosaur/movi_c_feat_rec_vitb16.yaml\npython -m ocl.cli.train +experiment=projects/bridging/dinosaur/movi_c_feat_rec_vitb16_adaslot.yaml +load_model_weight=PATH-TO-KMAX-SLOT-CHECKPOINT\npython -m ocl.cli.eval +experiment=projects/bridging/dinosaur/movi_c_feat_rec_vitb16_adaslot_eval.yaml ++load_checkpoint=PATH-TO-ADASLOT-CHECKPOINT\n\npython -m ocl.cli.train +experiment=projects/bridging/dinosaur/coco_feat_rec_dino_base16.yaml\npython -m ocl.cli.train +experiment=projects/bridging/dinosaur/coco_feat_rec_dino_base16_adaslot.yaml +load_model_weight=PATH-TO-KMAX-SLOT-CHECKPOINT\npython -m ocl.cli.eval +experiment=projects/bridging/dinosaur/coco_feat_rec_dino_base16_adaslot_eval.yaml ++load_checkpoint=PATH-TO-ADASLOT-CHECKPOINT\n\npython -m ocl.cli.train +experiment=slot_attention/clevr10.yaml\npython -m ocl.cli.train +experiment=slot_attention/clevr10_adaslot.yaml +load_model_weight=PATH-TO-KMAX-SLOT-CHECKPOINT\npython -m ocl.cli.eval +experiment=slot_attention/clevr10_adaslot_eval.yaml ++load_checkpoint=PATH-TO-ADASLOT-CHECKPOINT\n```\n\nThe result is saved in a timestamped subdirectory in `outputs/\u003cexperiment_name\u003e`, i.e. `outputs/OC-MOT/cater/\u003cdate\u003e_\u003ctime\u003e` in the above case. The prefix path `outputs` can be configured using the `experiment.root_output_path` variable.\n\n## Citation\n\nPlease cite our paper if you find this repo useful!\n\n```bibtex\n@inproceedings{fan2024adaptive,\n  title={Adaptive slot attention: Object discovery with dynamic slot number},\n  author={Fan, Ke and Bai, Zechen and Xiao, Tianjun and He, Tong and Horn, Max and Fu, Yanwei and Locatello, Francesco and Zhang, Zheng},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  pages={23062--23071},\n  year={2024}\n}\n```\n\nRelated projects that this paper is developed upon:\n\n```bibtex\n@misc{oclf,\n  author = {Max Horn and Maximilian Seitzer and Andrii Zadaianchuk and Zixu Zhao and Dominik Zietlow and Florian Wenzel and Tianjun Xiao},\n  title = {Object Centric Learning Framework (version 0.1)},\n  year  = {2023},\n  url   = {https://github.com/amazon-science/object-centric-learning-framework},\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Fadaslot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famazon-science%2Fadaslot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Fadaslot/lists"}