# VRDP (NeurIPS 2021)
**[Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language](https://arxiv.org/abs/2110.15358)**
[Mingyu Ding](https://dingmyu.github.io/),
[Zhenfang Chen](https://zfchenunique.github.io/),
[Tao Du](https://people.csail.mit.edu/taodu/),
[Ping Luo](http://luoping.me/),
[Joshua B. Tenenbaum](https://web.mit.edu/cocosci/josh.html), and
[Chuang Gan](http://people.csail.mit.edu/ganchuang/)
More details can be found at the [Project Page](http://vrdp.csail.mit.edu/).
If you find our work useful in your research, please consider citing our paper:
    @inproceedings{ding2021dynamic,
        author = {Ding, Mingyu and Chen, Zhenfang and Du, Tao and Luo, Ping and Tenenbaum, Joshua B and Gan, Chuang},
        title = {Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language},
        booktitle = {Advances In Neural Information Processing Systems},
        year = {2021}
    }

## Prerequisites
- Python 3
- PyTorch 1.3 or higher
- All required packages are covered by Miniconda
- Both CPUs and GPUs are supported

## Dataset preparation
- Download the videos, video annotations, questions and answers, and object proposals from the [official website](http://clevrer.csail.mit.edu/#).
- Extract `.png` frames from the videos with ffmpeg (a sample command is sketched at the end of this section).
- Organize the data as shown below.
```
clevrer
├── annotation_00000-01000
│ ├── annotation_00000.json
│ ├── annotation_00001.json
│ └── ...
├── ...
├── image_00000-01000
│ │ ├── 1.png
│ │ ├── 2.png
│ │ └── ...
│ └── ...
├── ...
├── questions
│ ├── train.json
│ ├── validation.json
│ └── test.json
├── proposals
│ ├── proposal_00000.json
│ ├── proposal_00001.json
│ └── ...
```

- We also provide data for physics learning and program execution in [Google Drive](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/u3007305_connect_hku_hk/Emlb-yHsV6ZLjDcVAxl7TOYBPkMA6pDcA505dtsIEQ1cqQ?e=0lQuoY). You can optionally download it and put it in the `./data/` folder.
- Download the processed data [executor_data.zip](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/u3007305_connect_hku_hk/Emlb-yHsV6ZLjDcVAxl7TOYBPkMA6pDcA505dtsIEQ1cqQ?e=0lQuoY) for the executor and unzip it into `./executor/data/`.
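As referenced in the frame-extraction step above, a minimal ffmpeg invocation might look like the following. The video filename and the per-video output folder are assumptions; adjust them to your local layout.
```shell
# Hypothetical example: dump the frames of one CLEVRER video as 1.png, 2.png, ...
mkdir -p clevrer/image_00000-01000/video_00000
ffmpeg -i video_00000.mp4 clevrer/image_00000-01000/video_00000/%d.png
```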
## Get Object Dictionaries (Concepts and Trajectories)
Download the [object proposals](http://clevrer.csail.mit.edu/#) from the region proposal network and follow the `Step-by-step Training` in [DCL](https://github.com/zfchenUnique/DCL-Release) to get object concepts and trajectories.
The above process includes:
- trajectory extraction
- concept learning
- trajectory refinement

Or you can download our extracted object dictionaries [object_dicts.zip](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/u3007305_connect_hku_hk/Emlb-yHsV6ZLjDcVAxl7TOYBPkMA6pDcA505dtsIEQ1cqQ?e=0lQuoY) directly from Google Drive.
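For intuition, an object-dictionary entry pairs learned visual concepts with a per-frame trajectory. The snippet below is purely illustrative; the actual schema is produced by the DCL pipeline and may differ.
```python
# Hypothetical example of a single object-dictionary entry (schema is an assumption):
object_entry = {
    "color": "red", "material": "metal", "shape": "cylinder",  # learned concepts
    "trajectory": [
        {"frame": 0, "box": [112.0, 80.0, 150.0, 118.0]},      # per-frame bounding box
        {"frame": 1, "box": [115.5, 80.2, 153.5, 118.2]},
        # ...
    ],
}
```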
## Learning
### 1. Differentiable Physics Learning
After we get the above object dictionaries, we learn physical parameters from object properties and trajectories.
```shell
cd dynamics/
python3 learn_dynamics.py 10000 15000
# argv[1] and argv[2] are the start and end indices of the videos to process, respectively.
```

The output object physical parameters [object_dicts_with_physics.zip](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/u3007305_connect_hku_hk/Emlb-yHsV6ZLjDcVAxl7TOYBPkMA6pDcA505dtsIEQ1cqQ?e=0lQuoY) can be downloaded from Google Drive.
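For intuition, the sketch below shows the core mechanism in isolation: a physical parameter is recovered by gradient descent through a differentiable simulator that is fit to an observed trajectory. It is a toy 1-D example with assumed constants (frame rate, friction model), not the repository's `learn_dynamics.py`.
```python
# Minimal, self-contained illustration (NOT learn_dynamics.py): recover a friction
# coefficient by differentiating a simple simulator w.r.t. its physical parameter
# and matching the observed trajectory.
import torch

DT = 1.0 / 25     # frame interval (assumption)
G = 9.8           # gravitational acceleration

def simulate(mu, x0, v0, steps):
    """Differentiable 1-D rollout of an object sliding under friction `mu`."""
    xs, x, v = [], x0, v0
    for _ in range(steps):
        v = torch.clamp(v - mu * G * DT, min=0.0)   # friction decelerates until rest
        x = x + v * DT
        xs.append(x)
    return torch.stack(xs)

# Synthetic "observed" trajectory generated with an unknown friction of 0.3.
with torch.no_grad():
    observed = simulate(torch.tensor(0.3), torch.tensor(0.0), torch.tensor(2.0), 50)

mu = torch.tensor(0.05, requires_grad=True)          # initial guess of the parameter
optimizer = torch.optim.Adam([mu], lr=0.02)
for _ in range(500):
    optimizer.zero_grad()
    predicted = simulate(mu, torch.tensor(0.0), torch.tensor(2.0), 50)
    loss = torch.nn.functional.mse_loss(predicted, observed)
    loss.backward()                                   # gradients flow through the simulator
    optimizer.step()

print(f"recovered friction coefficient: {mu.item():.3f}")   # should approach 0.3
```
Roughly speaking, the counterfactual and predictive steps below then re-run the fitted simulator, either under an intervened scene (e.g., an object removed or a parameter changed) or re-anchored to the observed frames.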
### 2. Physics Simulation (counterfactual)
Run the physics simulation with the learned physical parameters to generate counterfactual trajectories and events.
```shell
cd dynamics/
python3 physics_simulation.py 10000 15000
# argv[1] and argv[2] are the start and end indices of the videos to process, respectively.
```

The output simulated trajectories/events [object_simulated.zip](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/u3007305_connect_hku_hk/Emlb-yHsV6ZLjDcVAxl7TOYBPkMA6pDcA505dtsIEQ1cqQ?e=0lQuoY) can be downloaded from Google Drive.
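Conceptually, a counterfactual rollout is just the fitted simulator re-run under an intervention. The schematic sketch below is an assumption for illustration only (the state layout and `step` function are not the repository's interface):
```python
# Hypothetical counterfactual re-simulation: remove one object, keep the learned
# parameters, and roll the dynamics forward again from the initial frame.
def counterfactual_rollout(step, init_objects, params, removed_id, horizon=125):
    objects = [o for o in init_objects if o["id"] != removed_id]   # the intervention
    trajectory = []
    for _ in range(horizon):
        objects = step(objects, params)      # one step of the learned dynamics
        trajectory.append(objects)
    return trajectory
```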
### 3. Physics Simulation (predictive)
Long-range predictions are corrected according to the video observations.
```shell
cd dynamics/
python3 refine_prediction.py 10000 15000
# argv[1] and argv[2] are the start and end indices of the videos to process, respectively.
```

The output refined trajectories/events [object_updated_results.zip](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/u3007305_connect_hku_hk/Emlb-yHsV6ZLjDcVAxl7TOYBPkMA6pDcA505dtsIEQ1cqQ?e=0lQuoY) can be downloaded from Google Drive.
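A minimal sketch of the idea, assuming a generic per-step dynamics function (this is not the interface of `refine_prediction.py`): whenever an observed frame is available, the simulated state is snapped back to the observation before rolling forward again.
```python
# Hypothetical sketch of observation-corrected prediction (interface is an assumption).
def refined_rollout(step_fn, init_state, observations, horizon):
    """step_fn: one step of learned dynamics; observations: {frame_index: state}."""
    state, trajectory = init_state, []
    for t in range(horizon):
        state = step_fn(state)            # predict the next state with learned physics
        if t in observations:             # an observed frame is available here
            state = observations[t]       # correct accumulated drift with the observation
        trajectory.append(state)
    return trajectory
```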
## Evaluation
After we get the final trajectories/events, we perform the neuro-symbolic execution and evaluate the performance on the validation set.
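For intuition, the executor grounds each question's symbolic program on the recovered object concepts and the (simulated) trajectories/events. The toy example below is purely illustrative; the data layout and operators are assumptions, not the `executor/` API.
```python
# Toy neuro-symbolic execution: answer "What is the color of the object
# that collides with the cube?" on hypothetical object/event dictionaries.
objects = [
    {"id": 0, "color": "red", "material": "metal", "shape": "cube"},
    {"id": 1, "color": "blue", "material": "rubber", "shape": "sphere"},
]
events = [{"type": "collision", "objects": (0, 1), "frame": 42}]

def filter_shape(objs, shape):
    return [o for o in objs if o["shape"] == shape]

def collision_partner(obj_id):
    for e in events:
        if e["type"] == "collision" and obj_id in e["objects"]:
            return next(o for o in objects if o["id"] != obj_id and o["id"] in e["objects"])
    return None

cube = filter_shape(objects, "cube")[0]
print(collision_partner(cube["id"])["color"])   # -> blue
```
The actual evaluation on the validation set is run with: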
```shell
cd executor/
python3 evaluation.py
```

The test JSON file for evaluation on [EvalAI](https://eval.ai/web/challenges/challenge-page/667/overview) can be generated by:
```shell
cd executor/
python3 get_results.py
```

## The Generalized CLEVRER Dataset (counterfactual_mass)
- Download [causal_mass.zip](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/u3007305_connect_hku_hk/Emlb-yHsV6ZLjDcVAxl7TOYBPkMA6pDcA505dtsIEQ1cqQ?e=0lQuoY) and [counterfactual_mass.zip](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/u3007305_connect_hku_hk/Emlb-yHsV6ZLjDcVAxl7TOYBPkMA6pDcA505dtsIEQ1cqQ?e=0lQuoY) from Google Drive.
- Generate counterfactual data on the collision event by running `python3 counterfactual_mass/generate_data.py`.

## Examples
- Predictive question

- Counterfactual question
## Acknowledgements
For questions regarding VRDP, feel free to post here or directly contact the author ([email protected]).