{"id":18401710,"url":"https://github.com/borealisai/provide","last_synced_at":"2025-04-07T07:31:45.999Z","repository":{"id":37649292,"uuid":"271132335","full_name":"BorealisAI/PROVIDE","owner":"BorealisAI","description":"PROVIDE: A Probabilistic Framework for Unsupervised Video Decomposition (UAI 2021)","archived":false,"fork":false,"pushed_at":"2023-10-03T22:35:38.000Z","size":40799,"stargazers_count":13,"open_issues_count":4,"forks_count":1,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-04-18T03:18:37.567Z","etag":null,"topics":["probabilistic","pytorch","uai","unsupervised","video-decomposition"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BorealisAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-06-09T23:40:53.000Z","updated_at":"2024-04-18T03:18:37.568Z","dependencies_parsed_at":"2023-01-21T12:48:39.635Z","dependency_job_id":null,"html_url":"https://github.com/BorealisAI/PROVIDE","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BorealisAI%2FPROVIDE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BorealisAI%2FPROVIDE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BorealisAI%2FPROVIDE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BorealisAI%2FPROVIDE/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BorealisAI","download_url":"https://codeload.github.com/BorealisAI/PROVIDE/tar
.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223274431,"owners_count":17118001,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["probabilistic","pytorch","uai","unsupervised","video-decomposition"],"created_at":"2024-11-06T02:39:42.692Z","updated_at":"2024-11-06T02:39:43.317Z","avatar_url":"https://github.com/BorealisAI.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\nThis is the code repository complementing the paper \"PROVIDE: A Probabilistic Framework for Unsupervised Video Decomposition\".  The pretrained models are included in the repo.\n\n\nGIF showing results for CLEVRER dataset       |  Scene Decomposition Experiment\n:-------------------------:|:-------------------------:\n![gif](gifs/clevrer_5obj.gif) |  ![gif](gifs/results_table.png )\n\n## Dependencies and Setup\n\n- scikit_image==0.16.2\n- ipdb==0.12.2\n- opencv_python==4.2.0.32\n- imageio==2.6.1\n- torchvision==0.4.0a0+9232c4a\n- h5py==2.7.1\n- numpy==1.18.0\n- torch==1.2.0\n- matplotlib==2.1.2\n- visdom==0.1.8.9\n- moviepy==1.0.3\n- Pillow==7.1.2\n- scikit_learn==0.23.0\n- skimage==0.0\n- tensorboardX==2.0\n\n\n## Experiments\n\n### Datasets\n\n#### Bouncing Balls\n- Please download the Bouncing Balls datasets 'balls4mass64.h5' and 'balls678mass64.h5' from the official R-NEM[2] [website](https://github.com/sjoerdvansteenkiste/Relational-NEM).\n- Run the scripts (bb_binary.py, bb_binary678.py, bb_colored.py, bb_colored678.py and bb_colored678_4_colors.py) from the same directory where the .h5 files are. 
For example:\n```bash\npython bb_binary.py\n```\nThis might take a while.\n\n#### CLEVRER\n- The train split, validation split and validation annotations should be obtained from the official CLEVRER [website](http://clevrer.csail.mit.edu/)[3]. We use the validation set as the test set, because the test set does not contain annotations. Please unzip everything you download.\n- For testing, we trim the videos to a subsequence containing at least 3 objects and object motion. We compute these subsequences by running the script (slice_videos_from_annotations.py in the attached code) from the folder with the validation split and validation annotations.\n```bash\npython slice_videos_from_annotations.py\n```\n- The test set ground truth masks can be downloaded from [here](https://drive.google.com/file/d/1dRnBKRJXsEyKe0EaNq3SHK1KMiJOv71v/view). The masks and the preprocessed test videos will be grouped into separate folders based on the number of objects in a video.\n- To finish the dataset preparation, run clevrer.py from the directory that contains the downloaded train data. Then run clevrer_test.py and clevrer_masks_test.py from the directory that contains the output of slice_videos_from_annotations.py together with the downloaded, unzipped masks.\n\n\n### Test\nYou will need at least one GPU to run tests. We used a GeForce GTX 10 series GPU and Python 3.6.9. Use the following commands to test the models:\n \n- Bouncing Balls 4 balls/binary\n```bash\npython scripts/test.py --batch_size 1 --datapath /path/to/bb_binary/  --gt_datapath /path/to/bb_color --model_name bb_binary --T 6  --K 5\n```\nHere and in all commands below, use the --max_num_frames flag to set the number of frames per video. The default is 30.\n\nTo enable frame prediction, use the --predict_frames flag followed by the desired number of predicted frames.\n\nAlso, make sure that you have prepared both bb_binary and bb_color for this experiment if you want to compute the scores. 
bb_color is the ground truth (GT) for bb_binary. If you don't want to compute the scores and just want to visualise results, you can set --gt_datapath to /path/to/bb_binary/.\n\nIf you want to visualise the outputs or generate latent walks, add the batch numbers to batch_to_print and batch_to_print_latent in test.py.\n\n- Bouncing Balls  4 balls/colored\n\n```bash\npython scripts/test.py --batch_size 1 --datapath /path/to/bb_color/  --gt_datapath /path/to/bb_color --model_name bb_color --T 6  --K 5\n```\n\n- Bouncing Balls  4-8 balls/binary\n```bash\npython scripts/test.py --batch_size 1 --datapath /path/to/bb_binary678/ --gt_datapath /path/to/bb_color678/ --model_name bb_binary --T 6  --K 9 --max_num_frames 10\n```\n\n- Bouncing Balls  4-8 balls/colored/ colors\n```bash\npython scripts/test.py --batch_size 1 --datapath /path/to/bb_color678/ --gt_datapath /path/to/bb_color678/ --model_name bb_color --T 6  --K 9 --max_num_frames 10\n```\n- Bouncing Balls  4-8 balls/colored/4 colors\n```bash\npython scripts/test.py --batch_size 1 --datapath /path/to/bb_color678_4_colors/ --gt_datapath /path/to/bb_color678/ --model_name bb_color --T 6  --K 9 --max_num_frames 10\n```\n\n- CLEVRER 3-5 objects\n```bash\npython scripts/test.py --batch_size 1 --datapath /path/to/clevrer345/ --gt_datapath /path/to/clevrer345masks/ --model_name clevrer --T 5  --K 6\n```\n- CLEVRER 6 objects\n```bash\npython scripts/test.py --batch_size 1 --datapath /path/to/clevrer6/ --gt_datapath /path/to/clevrer6masks/ --model_name clevrer --T 5  --K 6\n```\n\n\n### Train\n\nTo train the models, we used 8 GeForce GTX 10 series GPUs.\n\n- Bouncing balls binary\n```bash\npython scripts/train.py --batch_size 32 --max_num_frames 4 --datapath /path/to/bb_binary/ --model_name bb_binary --T 6  --K 5\n```\n- Bouncing balls color\n```bash\npython scripts/train.py --batch_size 32 --max_num_frames 4 --param_schedule --datapath /path/to/bb_color/ --model_name bb_color_train --T 6  --K 5\n```\n\n- CLEVRER\n```bash\npython 
scripts/train.py --batch_size 32 --max_num_frames 4 --param_schedule --datapath /path/to/clevrer/ --gt_datapath /path/to/clevrer/ --model_name clevrer --T 5  --K 6\n```\n\n\n## Acknowledgements\n\nWe thank Michael Kelly for allowing us to use his [implementation](https://github.com/MichaelKevinKelly/IODINE) of the IODINE[1] paper that has served as a backbone for developing this model.\n\n\n## References\n\n 1. Greff, K., Kaufmann, R.L., Kabra, R., Watters, N., Burgess, C., Zoran, D., Matthey, L., Botvinick, M., Lerchner, A.: Multi-object representation learning with iterative variational inference. https://arxiv.org/pdf/1903.00450.pdf.\n\n 2. Van Steenkiste, S., Chang, M., Greff, K., Schmidhuber, J.: Relational neural expectation maximization: Unsupervised discovery of objects and their interactions. https://arxiv.org/pdf/1802.10353.pdf.\n\n 3. Yi, K., Gan, C., Li, Y., Kohli, P., Wu, J., Torralba, A., Tenenbaum, J.B.: CLEVRER: Collision events for video representation and reasoning. https://arxiv.org/pdf/1910.01442.pdf.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fborealisai%2Fprovide","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fborealisai%2Fprovide","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fborealisai%2Fprovide/lists"}