{"id":18382363,"url":"https://github.com/li-plus/ssm-vos","last_synced_at":"2025-07-18T16:02:23.561Z","repository":{"id":114149841,"uuid":"340597501","full_name":"li-plus/SSM-VOS","owner":"li-plus","description":"Separable Structure Modeling for Semi-supervised Video Object Segmentation","archived":false,"fork":false,"pushed_at":"2022-04-21T02:59:52.000Z","size":101,"stargazers_count":7,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-06T23:34:23.251Z","etag":null,"topics":["computer-vision","davis-challenge","machine-learning","segmentation","semi-supervised","video-object-segmentation","vos","youtube-vos"],"latest_commit_sha":null,"homepage":"https://ieeexplore.ieee.org/document/9356697","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/li-plus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-20T08:01:45.000Z","updated_at":"2025-03-26T14:48:57.000Z","dependencies_parsed_at":"2023-05-01T18:33:51.703Z","dependency_job_id":null,"html_url":"https://github.com/li-plus/SSM-VOS","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/li-plus/SSM-VOS","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/li-plus%2FSSM-VOS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/li-plus%2FSSM-VOS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/li-plus%2FSSM-VOS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/li-plus%2FSSM-VOS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/li-plus","download_url":"https://codeload.github.com/li-plus/SSM-VOS/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/li-plus%2FSSM-VOS/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265791232,"owners_count":23829159,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","davis-challenge","machine-learning","segmentation","semi-supervised","video-object-segmentation","vos","youtube-vos"],"created_at":"2024-11-06T01:04:46.173Z","updated_at":"2025-07-18T16:02:23.556Z","avatar_url":"https://github.com/li-plus.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SSM-VOS: Separable Structure Modeling for Semi-supervised Video Object Segmentation [[paper]](https://ieeexplore.ieee.org/document/9356697)\n\n![Framework](fig/framework.jpg)\n\nA PyTorch implementation of our paper [Separable Structure Modeling for Semi-supervised Video Object Segmentation](https://ieeexplore.ieee.org/document/9356697) by [Wencheng Zhu](https://woshiwencheng.github.io/), [Jiahao Li](https://github.com/li-plus), [Jiwen Lu](http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/), and [Jie Zhou](http://www.au.tsinghua.edu.cn/info/1078/1635.htm). 
Published in [IEEE Transactions on Circuits and Systems for Video Technology](https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=76).\n\n## Getting Started\n\nFirst, clone this project to your local environment.\n\n```sh\ngit clone https://github.com/li-plus/SSM-VOS.git \u0026\u0026 cd SSM-VOS\n```\n\nIt is recommended to create a virtual environment with Python \u003e= 3.6.\n\n```sh\nconda create --name ssm python=3.8\nconda activate ssm\n```\n\nInstall the Python dependencies.\n\n```sh\npip install -r requirements.txt\n```\n\n## Datasets Preparation\n\n### Downloading\n\nDownload [DAVIS 2016](https://davischallenge.org/davis2016/code.html), [DAVIS 2017](https://davischallenge.org/davis2017/code.html) train-val and test-dev, and [YouTube-VOS 2018](https://youtube-vos.org/dataset/vos/) datasets from their official websites. Note that for DAVIS 2016 and DAVIS 2017, only the 480p version is needed.\n\n```sh\nmkdir -p datasets \u0026\u0026 cd datasets\n# DAVIS 2016\nwget https://graphics.ethz.ch/Downloads/Data/Davis/DAVIS-data.zip\nunzip DAVIS-data.zip\nmv DAVIS DAVIS2016\n# DAVIS 2017 Train Val\nwget https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-trainval-480p.zip\nunzip DAVIS-2017-trainval-480p.zip\nmv DAVIS DAVIS2017\n# DAVIS 2017 Test Dev\nwget https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-test-dev-480p.zip\nunzip DAVIS-2017-test-dev-480p.zip\nmv DAVIS DAVIS2017_test\n# YouTube-VOS 2018\n# You need to sign up for the competition on CodaLab and download the dataset manually.\n```\n\nIt is recommended to use the directory structure below. 
If you have saved the datasets in a different directory, either create a symbolic link or manually adjust the dataset paths in [make_index.py](src/make_index.py).\n\n```\nSSM-VOS\n└── datasets\n    ├── DAVIS2016\n    │   ├── Annotations\n    │   ├── ImageSets\n    │   └── JPEGImages\n    ├── DAVIS2017\n    │   ├── Annotations\n    │   ├── ImageSets\n    │   └── JPEGImages\n    ├── DAVIS2017_test\n    │   ├── Annotations\n    │   ├── ImageSets\n    │   └── JPEGImages\n    └── YouTubeVOS\n        ├── train\n        │   ├── Annotations\n        │   ├── JPEGImages\n        │   └── meta.json\n        └── valid\n            ├── Annotations\n            ├── JPEGImages\n            └── meta.json\n```\n\n### Indexing\n\nTo simplify the data-loading code, we first index the training, validation, and test sets of all datasets.\n\n```sh\npython make_index.py\n```\n\n## Evaluation\n\nOur pretrained models on DAVIS 2016, DAVIS 2017, and YouTube-VOS 2018 are available for download.\n\n```sh\nmkdir -p models/pretrained \u0026\u0026 cd models/pretrained\nwget https://www.dropbox.com/s/7dctisjdrl2b47c/ssm_davis16.pt -O ssm_davis16.pt\nwget https://www.dropbox.com/s/ew2d2gy3rldxob9/ssm_davis17.pt -O ssm_davis17.pt\nwget https://www.dropbox.com/s/jm24vm2puprcldz/ssm_youtube.pt -O ssm_youtube.pt\n```\n\nTo evaluate a given model on a specific dataset, specify the path to the model and the corresponding split file. For example, to evaluate the pretrained model on DAVIS 2017, run\n\n```sh\nCUDA_VISIBLE_DEVICES=0 python evaluate.py --split ../splits/davis2017_val.json \\\n    --resume ../models/pretrained/ssm_davis17.pt --save-dir ../models/pretrained/results/davis17/\n```\n\nThe script will generate separate mask results for each object and save them in the given `--save-dir`. 
We then merge the separate results into final masks.\n\n```sh\npython merge_masks.py -i ../models/pretrained/results/davis17/ \\\n    -o ../models/pretrained/results/davis17_merged/\n```\n\nTo evaluate the performance on DAVIS 2017, we use the [official evaluation code for DAVIS 2017](https://github.com/davisvideochallenge/davis2017-evaluation). Please follow its instructions to evaluate the final results.\n\nSimilarly, to evaluate our pretrained model on DAVIS 2016, run\n\n```sh\nCUDA_VISIBLE_DEVICES=0 python evaluate.py --split ../splits/davis2016_val.json \\\n    --resume ../models/pretrained/ssm_davis16.pt --save-dir ../models/pretrained/results/davis16/\n\npython merge_masks.py -i ../models/pretrained/results/davis16/ \\\n    -o ../models/pretrained/results/davis16_merged/\n```\n\nPlease use the [official evaluation code for DAVIS 2016](https://github.com/davisvideochallenge/davis-2017) to evaluate the final mask results.\n\n## Training\n\n### YouTube-VOS 2018\n\nWe pretrain our model on YouTube-VOS using 4 GeForce GTX 1080 Ti GPUs.\n\n```sh\nCUDA_VISIBLE_DEVICES=0,1,2,3 python train.py \\\n    --model-dir ../models/youtube --split ../splits/youtube_train.json\n```\n\nYou may start TensorBoard to keep track of the training process.\n\n```sh\ntensorboard --logdir ../models/youtube/board\n```\n\n### DAVIS 2017\n\nFor better performance, we further train the model on DAVIS 2017 only, starting from the best pretrained checkpoint, for example 80999.pt.\n\n```sh\nCUDA_VISIBLE_DEVICES=0,1,2,3 python train.py \\\n    --model-dir ../models/davis17 --split ../splits/davis2017_train.json \\\n    --resume ../models/youtube/checkpoints/80999.pt --max-epoch 60 \\\n    --base-lr 1e-6 --save-step 640 --lr-decay-step 1920\n```\n\n### DAVIS 2016\n\nSimilarly, we also train our model on DAVIS 2016.\n\n```sh\nCUDA_VISIBLE_DEVICES=0,1,2,3 python train.py \\\n    --model-dir ../models/davis16 --split ../splits/davis2016_train.json \\\n    --resume ../models/youtube/checkpoints/80999.pt 
--max-epoch 100 \\\n    --base-lr 1e-6 --save-step 130 --lr-decay-step 650\n```\n\n## Citation\n\nIf you find our paper or code helpful in your research, feel free to cite it.\n\n```\n@article{zhu2021separable,\n  title={Separable Structure Modeling for Semi-supervised Video Object Segmentation},\n  author={Zhu, Wencheng and Li, Jiahao and Lu, Jiwen and Zhou, Jie},\n  journal={IEEE Transactions on Circuits and Systems for Video Technology},\n  year={2021},\n  publisher={IEEE}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fli-plus%2Fssm-vos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fli-plus%2Fssm-vos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fli-plus%2Fssm-vos/lists"}