{"id":20434203,"url":"https://github.com/vt-vl-lab/sdn","last_synced_at":"2025-08-11T23:15:18.382Z","repository":{"id":104150681,"uuid":"217920896","full_name":"vt-vl-lab/SDN","owner":"vt-vl-lab","description":"[NeurIPS 2019] Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition","archived":false,"fork":false,"pushed_at":"2024-03-20T00:54:34.000Z","size":41,"stargazers_count":83,"open_issues_count":2,"forks_count":13,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-12T21:11:34.451Z","etag":null,"topics":["action-recognition","activity-recognition","debiasisng","representation-learning","video-understanding"],"latest_commit_sha":null,"homepage":"http://chengao.vision/SDN/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vt-vl-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-10-27T21:46:16.000Z","updated_at":"2025-03-12T16:06:38.000Z","dependencies_parsed_at":null,"dependency_job_id":"76e57799-bf52-4ddc-81f0-39ef844d5af4","html_url":"https://github.com/vt-vl-lab/SDN","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vt-vl-lab/SDN","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vt-vl-lab%2FSDN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vt-vl-lab%2FSDN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vt-vl-lab%2FSDN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/vt-vl-lab%2FSDN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vt-vl-lab","download_url":"https://codeload.github.com/vt-vl-lab/SDN/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vt-vl-lab%2FSDN/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269970120,"owners_count":24505466,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-11T02:00:10.019Z","response_time":75,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["action-recognition","activity-recognition","debiasisng","representation-learning","video-understanding"],"created_at":"2024-11-15T08:25:04.981Z","updated_at":"2025-08-11T23:15:18.313Z","avatar_url":"https://github.com/vt-vl-lab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SDN: Scene Debiasing Network for Action Recognition in PyTorch\nWe release the code for the NeurIPS 2019 paper \"Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition\". 
The code is built upon the [3D-ResNets-PyTorch codebase](https://github.com/kenshohara/3D-ResNets-PyTorch).\n\nFor details, visit our [project website](http://chengao.vision/SDN/) or see our [full paper](https://papers.nips.cc/paper/8372-why-cant-i-dance-in-the-mall-learning-to-mitigate-scene-bias-in-action-recognition.pdf).\n\n## Reference\n[Jinwoo Choi](https://sites.google.com/site/jchoivision/), [Chen Gao](https://gaochen315.github.io/), [Joseph C. E. Messou](https://josephcmessou.weebly.com/about.html), [Jia-Bin Huang](https://filebox.ece.vt.edu/~jbhuang/index.html). Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition. Neural Information Processing Systems (NeurIPS) 2019.\n\n```\n@inproceedings{choi2019sdn,\n    title = {Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition},\n    author = {Choi, Jinwoo and Gao, Chen and Messou, Joseph C. E. and Huang, Jia-Bin},\n    booktitle={NeurIPS},\n    year={2019}\n}\n```\n\n## Requirements\nThis codebase was developed and tested with:\n- Python 3.6\n- PyTorch 0.4.1\n- torchvision 0.2.1\n- CUDA 9.0\n- CUDNN 7.1\n- GPU: 2xP100\n\nDependencies are listed in `sdn_packages.txt`. Install them with:\n```\npip install -r sdn_packages.txt\n```\n\n## Datasets\n### Prepare your dataset\n**1. Download and pre-process data**\n- Follow the [3D-ResNets-PyTorch instructions](https://github.com/kenshohara/3D-ResNets-PyTorch#preparation).\n\n**2. 
Download the scene and human detection NumPy files**\n- [Download the Mini-Kinetics scene pseudo labels](https://filebox.ece.vt.edu/~jinchoi/files/sdn/places_data.zip)\n- [Download the Mini-Kinetics human detections](https://filebox.ece.vt.edu/~jinchoi/files/sdn/detections.zip)\n\n## Train\n### Training on a source dataset (Mini-Kinetics)\n**- Baseline model without any debiasing**\n```\npython train.py \\\n--video_path \u003cyour dataset dir path\u003e \\\n--annotation_path \u003cyour dataset dir path\u003e/kinetics.json \\\n--result_path \u003cpath to save your model\u003e \\\n--root_path \u003cyour dataset dir path\u003e \\\n--dataset kinetics \\\n--n_classes 200 \\\n--n_finetune_classes 200 \\\n--model resnet \\\n--model_depth 18 \\\n--resnet_shortcut A \\\n--batch_size 32 \\\n--val_batch_size 16 \\\n--n_threads 16 \\\n--checkpoint 1 \\\n--ft_begin_index 0 \\\n--is_mask_adv \\\n--learning_rate 0.0001 \\\n--weight_decay 1e-5 \\\n--n_epochs 100 \\\n--pretrain_path \u003cpre-trained model file path\u003e\n```\n\n**- SDN model with scene adversarial loss only**\n```\npython train.py \\\n--video_path \u003cyour dataset dir path\u003e \\\n--annotation_path \u003cyour dataset dir path\u003e/kinetics.json \\\n--result_path \u003cpath to save your model\u003e \\\n--root_path \u003cyour dataset dir path\u003e \\\n--dataset kinetics_adv \\\n--n_classes 200 \\\n--n_finetune_classes 200 \\\n--model resnet \\\n--model_depth 18 \\\n--resnet_shortcut A \\\n--batch_size 32 \\\n--val_batch_size 16 \\\n--n_threads 16 \\\n--checkpoint 1 \\\n--ft_begin_index 0 \\\n--num_place_hidden_layers 3 \\\n--new_layer_lr 1e-2 \\\n--learning_rate 1e-4 \\\n--warm_up_epochs 5 \\\n--weight_decay 1e-5 \\\n--n_epochs 100 \\\n--place_pred_path \u003cfull path of your kinetics pseudo scene labels\u003e \\\n--is_place_adv \\\n--is_place_soft \\\n--alpha 1.0 \\\n--is_mask_adv \\\n--num_places_classes 365 \\\n--pretrain_path \u003cpre-trained model file path\u003e\n```\n\n**- Full 
SDN model with 1) scene adversarial loss and 2) human mask confusion loss**\n```\npython train.py \\\n--video_path \u003cyour dataset dir path\u003e \\\n--annotation_path \u003cyour dataset dir path\u003e/kinetics.json \\\n--result_path \u003cpath to save your model\u003e \\\n--root_path \u003cyour dataset dir path\u003e \\\n--dataset kinetics_adv_msk \\\n--n_classes 200 \\\n--n_finetune_classes 200 \\\n--model resnet \\\n--model_depth 18 \\\n--resnet_shortcut A \\\n--batch_size 32 \\\n--val_batch_size 16 \\\n--n_threads 16 \\\n--checkpoint 1 \\\n--ft_begin_index 0 \\\n--num_place_hidden_layers 3 \\\n--num_human_mask_adv_hidden_layers 1 \\\n--new_layer_lr 1e-4 \\\n--learning_rate 1e-4 \\\n--warm_up_epochs 0 \\\n--weight_decay 1e-5 \\\n--n_epochs 100 \\\n--place_pred_path \u003cfull path of your kinetics pseudo scene labels\u003e \\\n--is_place_adv \\\n--is_place_soft \\\n--is_mask_entropy \\\n--alpha 0.5 \\\n--mask_ratio 1.0 \\\n--slower_place_mlp \\\n--not_replace_last_fc \\\n--num_places_classes 365 \\\n--human_dets_path \u003cfull path of your kinetics human detections\u003e \\\n--pretrain_path \u003cpre-trained model file path: e.g., your SDN model with scene adversarial loss only\u003e\n```\n\n### Finetuning on target datasets\n#### [Diving48](http://www.svcl.ucsd.edu/projects/resound/dataset.html) as an example\n```\npython train.py \\\n--dataset diving48 \\\n--root_path \u003cyour dataset path\u003e \\\n--video_path \u003cyour dataset path\u003e \\\n--n_classes 200 \\\n--n_finetune_classes 48 \\\n--model resnet \\\n--model_depth 18 \\\n--resnet_shortcut A \\\n--ft_begin_index 0 \\\n--batch_size 32 \\\n--val_batch_size 16 \\\n--n_threads 4 \\\n--checkpoint 1 \\\n--learning_rate 0.005 \\\n--weight_decay 1e-5 \\\n--n_epochs $epoch_ft \\\n--is_mask_adv \\\n--annotation_path $anno_path \\\n--result_path \u003cpath to save your fine-tuned model\u003e \\\n--pretrain_path \u003cpre-trained model file path: e.g., your full SDN model path\u003e\n```\n\n## 
Test\n```\npython train.py \\\n--dataset diving48 \\\n--root_path \u003cyour dataset path\u003e \\\n--video_path \u003cyour dataset path\u003e \\\n--n_finetune_classes 48 \\\n--n_classes 48 \\\n--model resnet \\\n--model_depth 18 \\\n--resnet_shortcut A \\\n--batch_size 32 \\\n--val_batch_size 16 \\\n--n_threads 4 \\\n--test \\\n--test_subset val \\\n--no_train \\\n--no_val \\\n--is_mask_adv \\\n--annotation_path $anno_path \\\n--result_path \u003cpath (dir) in which to save the test results\u003e \\\n--resume_path \u003cpath to your fine-tuned model checkpoint file\u003e\n```\nThis step generates a `val.json` file under `$result_path`.\n\n## Evaluation\n```\npython utils/eval_diving48.py \\\n--annotation_path $anno_path \\\n--prediction_path \u003cpath to your test result file (val.json) generated from the test step\u003e\n```\n\n## Pre-trained model weights provided\n[Download the pre-trained weights](https://www.dropbox.com/scl/fi/j2pgucu8gvpz3jp5ygl91/pre-trained_weights.tar?rlkey=gicecxrpj2o7ipjmhmx0hlcrl\u0026dl=0)\n\n## Acknowledgments\nThis code is built upon the [3D-ResNets-PyTorch codebase](https://github.com/kenshohara/3D-ResNets-PyTorch). We thank Kensho Hara.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvt-vl-lab%2Fsdn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvt-vl-lab%2Fsdn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvt-vl-lab%2Fsdn/lists"}