{"id":20852438,"url":"https://github.com/dirtyharrylyl/symnet","last_synced_at":"2025-05-12T05:30:46.496Z","repository":{"id":97765084,"uuid":"245598219","full_name":"DirtyHarryLYL/SymNet","owner":"DirtyHarryLYL","description":"As a part of the HAKE project (HAKE-Object), code for SymNet (CVPR'20 and TPAMI'21).","archived":false,"fork":false,"pushed_at":"2022-12-19T17:04:45.000Z","size":2353,"stargazers_count":52,"open_issues_count":1,"forks_count":5,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-04-01T00:43:38.751Z","etag":null,"topics":["attribute-object","compositional-zero-shot-learning","deep-learning","object-recognition"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DirtyHarryLYL.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-03-07T08:50:37.000Z","updated_at":"2025-03-20T02:10:52.000Z","dependencies_parsed_at":null,"dependency_job_id":"a66e7c65-7935-4f58-b62c-7cb7dda1ab66","html_url":"https://github.com/DirtyHarryLYL/SymNet","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DirtyHarryLYL%2FSymNet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DirtyHarryLYL%2FSymNet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DirtyHarryLYL%2FSymNet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DirtyHarryLYL%2FSymNet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Dirt
yHarryLYL","download_url":"https://codeload.github.com/DirtyHarryLYL/SymNet/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253681873,"owners_count":21946821,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attribute-object","compositional-zero-shot-learning","deep-learning","object-recognition"],"created_at":"2024-11-18T03:17:40.940Z","updated_at":"2025-05-12T05:30:44.972Z","avatar_url":"https://github.com/DirtyHarryLYL.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SymNet\nAs a part of [HAKE](http://hake-mvig.cn/) project (HAKE-Object).\n\n#### **News**: (2022.12.19) HAKE 2.0 is accepted by TPAMI!\n\n(2022.12.7) We release a new project [OCL](https://mvig-rhos.com/ocl) ([paper](https://arxiv.org/abs/2212.02710)). Data and code are coming soon.\n\n(2022.11.19) We release the interactive object bounding boxes \u0026 classes in the interactions within AVA dataset (2.1 \u0026 2.2)! [HAKE-AVA](https://github.com/DirtyHarryLYL/HAKE-AVA), [[Paper]](https://arxiv.org/abs/2211.07501).\n\n(2022.03.28) We release the code of multiple attribute recognition mentioned in PAMI version\n\n(2022.02.14) We release the human body part state labels based on AVA: [HAKE-AVA](https://github.com/DirtyHarryLYL/HAKE-AVA).\n\n(2021.10.06) Our extended version of [SymNet](https://github.com/DirtyHarryLYL/SymNet) is accepted by TPAMI! Paper and code are coming soon.\n\n(2021.2.7) Upgraded [HAKE-Activity2Vec](https://github.com/DirtyHarryLYL/HAKE-Action-Torch/tree/Activity2Vec) is released! 
Images/Videos --\u003e human box + ID + skeleton + part states + action + representation. [[Description]](https://drive.google.com/file/d/1iZ57hKjus2lKbv1MAB-TLFrChSoWGD5e/view?usp=sharing), Full demo: [[YouTube]](https://t.co/hXiAYPXEuL?amp=1), [[bilibili]](https://www.bilibili.com/video/BV1s54y1Y76s)\n\n\u003c!-- (2020.10.27) The code of [IDN](https://github.com/DirtyHarryLYL/HAKE-Action-Torch/tree/IDN-(Integrating-Decomposing-Network)) ([Paper](https://arxiv.org/abs/2010.16219)) in NeurIPS'20 is released! --\u003e\n\n(2020.6.16) Our larger version [HAKE-Large](https://github.com/DirtyHarryLYL/HAKE#hake-large-for-instance-level-hoi-detection) (\u003e120K images, activity and part state labels) is released!\n\nThis is the code accompanying our CVPR'20 and TPAMI'21 papers: **Symmetry and Group in Attribute-Object Compositions** [![report](https://img.shields.io/badge/ArXiv-Paper-red)](https://arxiv.org/abs/2004.00587), **Learning Single/Multi-Attribute of Object with Symmetry and Group** [![report](https://img.shields.io/badge/ArXiv-Paper-red)](https://arxiv.org/abs/2110.04603)\n\n\u003c!-- **Symmetry and Group in Attribute-Object Compositions**. [[arXiv](https://arxiv.org/abs/2004.00587)]\n*[Yong-Lu Li](https://dirtyharrylyl.github.io/), [Yue Xu](https://silicx.github.io/), Xiaohan Mao, [Cewu Lu](http://mvig.sjtu.edu.cn/)*\n\n**Learning Single/Multi-Attribute of Object with Symmetry and Group**. 
[[arXiv](https://arxiv.org/abs/2110.04603)]\n*[Yong-Lu Li](https://dirtyharrylyl.github.io/), [Yue Xu](https://silicx.github.io/), [Xinyu Xu](https://xuxinyu.website), Xiaohan Mao, [Cewu Lu](http://mvig.sjtu.edu.cn/)* --\u003e\n\n![Overview](./data/overview.png)\n\nIf you find this repository useful, please consider citing our papers.\n```\n---SymNet-PAMI\n@article{li2021learning,\n  title={Learning Single/Multi-Attribute of Object with Symmetry and Group},\n  author={Li, Yong-Lu and Xu, Yue and Xu, Xinyu and Mao, Xiaohan and Lu, Cewu},\n  journal={TPAMI},\n  year={2021}\n}\n---SymNet-CVPR\n@inproceedings{li2020symmetry,\n\ttitle={Symmetry and Group in Attribute-Object Compositions},\n\tauthor={Li, Yong-Lu and Xu, Yue and Mao, Xiaohan and Lu, Cewu},\n\tbooktitle={CVPR},\n\tyear={2020}\n}\n```\n\n## Prerequisites\n\n**Packages**: Install using `pip install -r requirements.txt`\n\n**Datasets**: Download and re-arrange with:\n\t\n\tcd data; bash download_data.sh\n\n**Features and pretrained models**: Features for the compositional ZSL (CZSL) setting\u003csup\u003e[1]\u003c/sup\u003e will be downloaded together with the datasets. Features for the generalized compositional ZSL (GCZSL) setting\u003csup\u003e[2]\u003c/sup\u003e can be extracted using:\n\n\tpython utils/dataset/GCZSL_dataset.py [MIT/UT]\n\n\nFor multiple attribute recognition, we reorganize the metadata of the aPY/SUN datasets with pre-extracted ResNet-50 features into 4 files `{APY/SUN}_{train/test}.pkl`.\nYou can download them from [Link](https://drive.google.com/file/d/1xkdxbgBhE1S7HdeaUtn_8rtm5lE0Dx6z/view) and put them into the `./data` folder.\n\nPretrained models and intermediate results can be downloaded from here: [Link](https://drive.google.com/drive/folders/1qcgAeEeXakX3-RsFM3pKfKsj7F18XBHA?usp=sharing). 
Please unzip the `obj_scores.zip` to `./data/obj_scores` and `weights.zip` to `./weights`.\n\n\n## Compositional Zero-shot Learning (CZSL)\n\nThese are commands for the split and evaluation metrics introduced by [1].\n\n### Training an object classifier\n\nBefore training a SymNet model, train an object classifier by running:\n\n\tpython run_symnet.py --network fc_obj --name MIT_obj_lr3e-3 --data MIT --epoch 1500 --batchnorm --lr 3e-3\n\tpython run_symnet.py --network fc_obj --name UT_obj_lr1e-3 --data UT --epoch 300 --batchnorm --lr 1e-3\n\nThen store the intermediate object results:\n\n\tpython test_obj.py --network fc_obj --name MIT_obj_lr3e-3 --data MIT --epoch 1120 --batchnorm\n\tpython test_obj.py --network fc_obj --name UT_obj_lr1e-3 --data UT --epoch 140 --batchnorm\n\nThe result files will be stored in `./data/obj_scores` with names `MIT_obj_lr3e-3_ep1120.pkl` and `UT_obj_lr1e-3_ep140.pkl` (in the examples above).\n\n### Training a SymNet\n\nTo train a SymNet with the hyper-parameters in our paper, run:\n\n\tpython run_symnet.py --name MIT_best --data MIT --epoch 400 --obj_pred MIT_obj_lr3e-3_ep1120.pkl --batchnorm --lr 5e-4 --bz 512 --lambda_cls_attr 1 --lambda_cls_obj 0.01 --lambda_trip 0.03 --lambda_sym 0.05 --lambda_axiom 0.01\n\tpython run_symnet.py --name UT_best --data UT --epoch 700 --obj_pred UT_obj_lr1e-3_ep140.pkl --batchnorm  --wordvec onehot  --lr 1e-4 --bz 256 --lambda_cls_attr 1 --lambda_cls_obj 0.5 --lambda_trip 0.5 --lambda_sym 0.01 --lambda_axiom 0.03\n\n\n\n### Model Evaluation\n\n\tpython test_symnet.py --name MIT_best --data MIT --epoch 320 --obj_pred MIT_lr3e-3_ep1120.pkl --batchnorm\n\tpython test_symnet.py --name UT_best --data UT --epoch 600 --obj_pred UT_lr1e-3_ep140.pkl --wordvec onehot --batchnorm\n\n\n\nMethod | MIT (top-1) | MIT (top-2) | MIT (top-3) | UT (top-1) | UT (top-2) | UT (top-3)  \n-- | -- | -- | -- | -- | -- | -- |\nVisual Product  | 9.8/13.9 | 16.1 | 20.6 | 49.9 | / | / \nLabelEmbed (LE) | 11.2/13.4| 17.6 | 22.4 | 
25.8 | / | / \n~- LEOR            | 4.5          | 6.2  | 11.8 |  /       | / | / \n~- LE + R          | 9.3          | 16.3 | 20.8 |  /       | / | / \n~- LabelEmbed+    | 14.8*         |  /   |  /   | 37.4| / | / \nAnalogousAttr | 1.4          |  /   |  /   | 18.3  |  /  |  /  \nRed Wine        | 13.1         | 21.2 | 27.6 | 40.3  |  /  |  /   \nAttOperator    | 14.2         | 19.6 | 25.1 | 46.2  | 56.6 | 69.2 \nTAFE-Net           | 16.4         | 26.4 | 33.0 | 33.2  |  /  |  /  \nGenModel       | 17.8         |  /   |  /   | 48.3  |  /  |  /  \n**SymNet (Ours)** | **19.9** | **28.2** | **33.8** | **52.1**  |**67.8** |  **76.0** \n\n\n\n## Generalized Compositional Zero-shot Learning (GCZSL)\n\nThese are commands for the split and evaluation metrics introduced by [2].\n\n### Training an object classifier\n\n\tpython run_symnet.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048  --lr 3e-3 --epoch 1000 --batchnorm --fc_cls 1024\n\n\tpython run_symnet.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --lr 1e-3 --epoch 700 --batchnorm  --fc_cls 1024\n\nTo store the object classification results for both the validation and test sets, run:\n\n\tpython test_obj.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048  --epoch 980 --batchnorm --fc_cls 1024 --test_set val\n\tpython test_obj.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048  --epoch 980 --batchnorm --fc_cls 1024 --test_set test\n\n\tpython test_obj.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --epoch 660 --batchnorm  --fc_cls 1024 --test_set val\n\tpython test_obj.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --epoch 660 --batchnorm  --fc_cls 1024 --test_set test\n\n\n### Training a SymNet\nTo train a SymNet for GCZSL, run:\n\n\tpython run_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_val_ep980.pkl 
--test_set val --lr 3e-4 --bz 512 --test_bz 512 --batchnorm  --lambda_cls_attr 1 --lambda_cls_obj 0.01 --lambda_trip 1 --lambda_sym 0.02 --lambda_axiom 0.02 --triplet_margin 0.3\n\n\tpython run_symnet_gczsl.py --data UTg --name UTg_best --epoch 300 --obj_pred UTg_obj_lr1e-3_val_ep660.pkl --test_set val --lr 1e-3 --bz 512 --test_bz 512 --wordvec onehot --batchnorm --lambda_cls_attr 1 --lambda_cls_obj 0.01 --fc_compress 512 --lambda_trip 1 --lambda_sym 0.02 --lambda_axiom 0.01\n\n\n### Model Evaluation\n\t\n\tpython test_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_test_ep980.pkl --bz 512 --test_bz 512 --batchnorm  --triplet_margin 0.3 --test_set test --topk 1\n\tpython test_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_val_ep980.pkl --bz 512 --test_bz 512 --batchnorm  --triplet_margin 0.3 --test_set val --topk 1\n\n\tpython test_symnet_gczsl.py --data UTg --name UTg_best --epoch 290 --obj_pred UTg_obj_lr1e-3_test_ep660.pkl --bz 512 --test_bz 512 --batchnorm --wordvec onehot --fc_compress 512 --test_set test --topk 1\n\tpython test_symnet_gczsl.py --data UTg --name UTg_best --epoch 290 --obj_pred UTg_obj_lr1e-3_val_ep660.pkl --bz 512 --test_bz 512 --batchnorm --wordvec onehot --fc_compress 512 --test_set val --topk 1\n\n\nMIT-States evaluation results (with metrics of TMN\u003csup\u003e[2]\u003c/sup\u003e)\n\nModel | Val Top-1 AUC | Val Top-2 AUC | Val Top-3 AUC | Test Top-1 AUC | Test Top-2 AUC | Test Top-3 AUC | Seen | Unseen | HM\n-- | -- | -- | -- | -- | -- | -- | -- | -- | --\nAttOperator  | 2.5 | 6.2 | 10.1 | 1.6 | 4.7 | 7.6 | 14.3    | 17.4 | 9.9 \nRed Wine      | 2.9 | 7.3 | 11.8 | 2.4 | 5.7 | 9.3 | 20.7    | 17.9 | 11.6\nLabelEmbed+  | 3.0 | 7.6 | 12.2 | 2.0 | 5.6 | 9.4 | 15.0    | 20.1 | 10.7\nGenModel     | 3.1 | 6.9 | 10.5 | 2.3 | 5.7 | 8.8 | 24.8    | 13.4 | 11.2\nTMN               | 3.5 | 8.1 | 12.4 | 2.9 | 7.1 | 11.5| 20.2    | 20.1 | 13.0\n**SymNet (CVPR)** | **4.3** | 
**9.8** | **14.8** | **3.0** | **7.6** | **12.3** | 24.4 | **25.2** | **16.1**\n**SymNet (TPAMI)** | **5.4** | **11.6** | **16.6** | **4.5** | **10.1** | **15.0** | **26.2** | **26.3** | **16.8**\n**SymNet (Latest Update)** | **5.8** | **12.2** | **17.8** | **5.3** | **11.3** | **16.5** | **29.5** | **26.1** | **17.4**\n\nUT-Zappos evaluation results (with metrics of CAUSAL\u003csup\u003e[3]\u003c/sup\u003e)\n\nModel | Unseen | Seen | Harmonic | Closed | AUC\n-- | -- | -- | -- | -- | -- \nLabelEmbed  | 16.2 | 53.0 | 24.7 | 59.3 | 22.9\nAttOperator | 25.5 | 37.9 | 27.9 | 54.0 | 22.1\nTMN        | 10.3 | 54.3 | 17.4 | **62.0** | 25.4\nCAUSAL     | **28.0** | 37.0 | **30.6** | 58.6 | 26.4\n**SymNet (Ours)** | 10.3 | **56.3** | 24.1 | 58.7 | **26.8**\n\n\n## Multiple Attribute Recognition\n\n\n### Training a SymNet\nTo train a SymNet for multiple attribute recognition, run:\n\n\tpython run_symnet_multi.py --name APY_best --data APY --rmd_metric sigmoid --fc_compress 256 --rep_dim 128  --test_freq 1  --epoch 100 --batchnorm --lr 3e-3 --bz 128 --lambda_cls_attr 1 --lambda_trip 1 --lambda_sym 5e-2 --bce_neg_weight 0.05 --lambda_cls_obj 5e-2 --lambda_axiom 1e-3  --lambda_multi_rmd 5e-2  --lambda_atten 1\n\n\tpython run_symnet_multi.py --name SUN_best --data SUN --rmd_metric rmd --fc_compress 1536 --rep_dim 128 --test_freq 5 --epoch 150 --batchnorm --lr 5e-3 --bz 128  --lambda_cls_attr 1 --lambda_trip 5e-2 --lambda_sym 8e-3 --bce_neg_weight 0.4 --lambda_cls_obj 3e-1 --lambda_axiom 1e-3 --lambda_multi_rmd 6e-2 --lambda_atten 6e-1\n\n\n### Model Evaluation\n\t\n\tpython test_symnet_multi.py --data APY --name APY_best --epoch 78 --batchnorm --rep_dim 128 --fc_compress 256\n\tpython test_symnet_multi.py --data SUN --name SUN_best --epoch 95 --batchnorm --rep_dim 128 --fc_compress 1536\n\n\n\nEvaluation results on aPY and SUN (with metrics of mAUC)\n\nModel \t\t\t\t| aPY\t \t| SUN \t\t|\n-- \t\t\t\t\t| -- \t\t| -- \t\t| \nALE \t\t\t\t| 69.2 \t\t| 74.5  \t|\nHAP \t\t\t\t| 58.2 
\t\t| 76.7\t\t|\nUDICA \t\t\t\t| 82.3 \t\t| 85.8\t\t|\nKDICA \t\t\t\t| 84.7 \t\t|\t/\t\t|\nUMF \t\t\t\t| 79.7 \t\t| 80.5\t\t|\nAMT \t\t\t\t| 84.5 \t\t| 82.5  \t|\nFMT \t\t\t\t| 70.5\t\t| 75.5  \t|\nGALM \t\t\t\t| 84.2 \t\t| 86.5\t\t|\n**SymNet (Ours)**\t| **86.1**\t| **88.4**  |\n\n## Tips\n\n### Use Customized Dataset\n\nTaking UT as an example, besides reorganizing the images into `data/ut-zap50k-original/images/[attribute]_[object]/`:\n\n- If you are using customized pairs composed of our provided attributes and objects, only the pair lists in `data/ut-zap50k-original/compositional-split/` need to be updated.\n\n- If you also use customized attributes and objects, there are several additional files to modify in the folder `utils/aux_data/`:\n\n  1. `UT_attrs.json` and `UT_objs.json` are the attribute and object lists, stored as `dict`. The keys are the original names and the values are the names in the pre-trained GloVe vocabulary.\n\n  2. `glove_UT.py` contains GloVe vectors for the attributes and objects. In our paper, `glove.6B.300d.txt` is used.\n\n  3. `UT_weight.py` contains loss weights for each individual attribute or object class (only `attr_weight` and `obj_weight`; `pair_weight` is never used and can be set to 1). In practice, these weights can help training on imbalanced data. Each weight is computed as **-log(p)**, where **p** is the occurrence frequency of an attribute or object in the train set. E.g., if a five-image dataset has attribute labels `[a,a,a,b,b]`, then the `attr_weight` for `a` and `b` is `[-log0.6, -log0.4]`. 
You may clip the values to prevent large or zero weights.\n\n\u003c!-- \n## TODO\n- [ ] Unified backbone\n- [ ] Tips for hyperparameters and tuning\n- [ ] Some possible tricks\n- [ ] New module for multi-label attribute recognition\n- [ ] Torch version --\u003e\n\n\n## Acknowledgement\nThe dataloader and evaluation code are based on [Attributes as Operators](https://github.com/Tushar-N/attributes-as-operators)\u003csup\u003e[1]\u003c/sup\u003e and [Task-Driven Modular Networks](https://github.com/facebookresearch/taskmodularnets)\u003csup\u003e[2]\u003c/sup\u003e.\n\n\n\n## Reference\n\n[1] [Attributes as Operators: Factorizing Unseen Attribute-Object Compositions](https://arxiv.org/abs/1803.09851)\n\n[2] [Task-Driven Modular Networks for Zero-Shot Compositional Learning](https://arxiv.org/abs/1905.05908)\n\n[3] [A causal view of compositional zero-shot recognition](https://arxiv.org/abs/2006.14610)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdirtyharrylyl%2Fsymnet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdirtyharrylyl%2Fsymnet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdirtyharrylyl%2Fsymnet/lists"}