{"id":26669218,"url":"https://github.com/silky1708/locate","last_synced_at":"2025-08-20T22:25:33.232Z","repository":{"id":192632556,"uuid":"684014355","full_name":"silky1708/LOCATE","owner":"silky1708","description":"[BMVC 2023] Official repository for LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training","archived":false,"fork":false,"pushed_at":"2024-02-11T16:15:57.000Z","size":31772,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-02-11T17:27:52.909Z","etag":null,"topics":["bmvc2023","object-discovery","segmentation","self-supervised-learning"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2308.11239","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/silky1708.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-08-28T09:14:48.000Z","updated_at":"2024-02-11T17:27:53.870Z","dependencies_parsed_at":null,"dependency_job_id":"ac63904c-8a05-4568-824e-855a0859d494","html_url":"https://github.com/silky1708/LOCATE","commit_stats":null,"previous_names":["silky1708/locate"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/silky1708%2FLOCATE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/silky1708%2FLOCATE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/silky1708%2FLOCATE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/silky1708%2FLOCATE/manifests"
,"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/silky1708","download_url":"https://codeload.github.com/silky1708/LOCATE/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245550497,"owners_count":20633871,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bmvc2023","object-discovery","segmentation","self-supervised-learning"],"created_at":"2025-03-25T21:39:23.447Z","updated_at":"2025-08-20T22:25:33.210Z","avatar_url":"https://github.com/silky1708.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LOCATE\n[BMVC 2023] Official repository for \"LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training\"  \n*Silky Singh, Shripad Deshmukh, Mausoom Sarkar, Balaji Krishnamurthy.*  \n\n[project page](https://silky1708.github.io/LOCATE/) | [arXiv](https://arxiv.org/abs/2308.11239) | [bibtex](https://github.com/silky1708/LOCATE/tree/main#citation)  \n\n\n![qual results](assets/locate_VOS_qual.png)\n\nOur self-supervised framework LOCATE trained on video datasets can perform object segmentation on standalone images. 
\n\n\u003c!-- ![model pipeline](assets/model_pipeline.png) --\u003e\n\n## Installation\n\n### Create a conda environment\n\n```\nconda create -n locate python=3.8\nconda activate locate\n```\n\nThe code has been tested with `python=3.8`, `pytorch=1.12.1`, `torchvision=0.13.1`, and `cudatoolkit=11.3` on an NVIDIA A100 machine.\n\nInstall PyTorch using the official installation instructions provided [here](https://pytorch.org/get-started/previous-versions/). The other dependencies can be installed by following the [guess-what-moves](https://github.com/karazijal/guess-what-moves) repository. The commands are listed below for completeness.\n\n```\nconda install -y pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch\nconda install -y kornia jupyter tensorboard timm einops scikit-learn scikit-image openexr-python tqdm gcc_linux-64=11 gxx_linux-64=11 fontconfig -c conda-forge\npip install cvbase opencv-python wandb \npython -m pip install 'git+https://github.com/facebookresearch/detectron2.git'\n```\n\n\n## Datasets\n\nWe have tested our method on video object segmentation (DAVIS 2016, FBMS59, SegTrackv2), image saliency detection (DUTS, ECSSD, OMRON), and object segmentation (CUB, Flowers-102) benchmarks.  \n\n\n## Training\n\n### Step 1. Graph Cut\n\nWe use the MaskCut algorithm from the CutLER repository [[link](https://github.com/facebookresearch/CutLER)] with `N=1` to get the segmentation mask for the salient object in each video frame independently. We modify the pipeline to also take in optical flow features of the video frame, and combine the image and flow feature similarities linearly to produce the edge weights. The modified code can be found in the `CutLER` directory. \n\nWe perform a single round of post-processing using Conditional Random Fields (CRF) to get pixel-level segmentation masks. The graph-cut masks for all the datasets are released [here](https://www.dropbox.com/scl/fo/wdr6jxutv9x4zte1n8jyz/h?rlkey=ayfmd4dp03tjdg6a2m0xg4iac\u0026dl=0). 
We use [ARFlow](https://github.com/lliuz/ARFlow) trained on the synthetic Sintel dataset to compute the optical flow between video frames.\n\n\n### Step 2. Bootstrapped Self-training\n\nUsing the segmentation masks from the previous step as pseudo-ground-truth, we train a [segmentation](https://github.com/facebookresearch/MaskFormer) network. To start training, run `train.sh` from the root directory.\n\n## Inference\n\nUse the test script to run inference: `python test.py`\n\n\n## Model Checkpoints\n\n| Dataset | Checkpoint path |\n| ------- | ---------- |\n| DAVIS16 | `locate_checkpoints/davis2016.pth` |\n| SegTrackv2 | `locate_checkpoints/segtrackv2.pth` |\n| FBMS59 (graph-cut masks) | `locate_checkpoints/fbms59_graphcut.pth` |\n| FBMS59 (zero-shot) | `locate_checkpoints/fbms59_zero_shot.pth` |\n| DAVIS16+STv2+FBMS | `locate_checkpoints/combined.pth` |\n\nThe checkpoints are released [here](https://www.dropbox.com/scl/fo/v2akgrbzyyvkgtr98x2ok/h?rlkey=wfhmcm26fb3ivirdpx6pdkdxb\u0026dl=0). The `combined.pth` checkpoint is the model trained on all the video datasets (DAVIS16, SegTrackv2, FBMS59) combined.\n\n## Acknowledgments\n\nThis repository is built upon [guess-what-moves](https://github.com/karazijal/guess-what-moves) and [CutLER](https://github.com/facebookresearch/CutLER). We thank the respective authors for open-sourcing their amazing work! 
\n\n\n\n## Citation\n\nIf you find this work useful, please consider citing:\n\n```\n@inproceedings{Singh_2023_BMVC,\nauthor    = {Silky Singh and Shripad V Deshmukh and Mausoom Sarkar and Balaji Krishnamurthy},\ntitle     = {LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training},\nbooktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},\npublisher = {BMVA},\nyear      = {2023},\nurl       = {https://papers.bmvc2023.org/0295.pdf}\n}\n```\n\n```\n@article{singh2023locate,\n  title={LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training},\n  author={Singh, Silky and Deshmukh, Shripad and Sarkar, Mausoom and Krishnamurthy, Balaji},\n  journal={arXiv preprint arXiv:2308.11239},\n  year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsilky1708%2Flocate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsilky1708%2Flocate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsilky1708%2Flocate/lists"}