{"id":31944438,"url":"https://github.com/kuohaozeng/visual_reaction","last_synced_at":"2025-10-14T10:26:10.395Z","repository":{"id":44193941,"uuid":"253993277","full_name":"KuoHaoZeng/Visual_Reaction","owner":"KuoHaoZeng","description":"Visual Reaction: Learning to Play Catch with Your Drone","archived":false,"fork":false,"pushed_at":"2023-07-23T11:16:18.000Z","size":2943,"stargazers_count":11,"open_issues_count":2,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-01-29T16:07:19.836Z","etag":null,"topics":["ai2-thor","computer-vision","drone","forecasting","reinforcement-learning","visual-reaction"],"latest_commit_sha":null,"homepage":"https://arxiv.org/pdf/1912.02155.pdf","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KuoHaoZeng.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-04-08T05:25:21.000Z","updated_at":"2023-11-21T14:41:44.000Z","dependencies_parsed_at":"2023-09-25T03:52:53.637Z","dependency_job_id":null,"html_url":"https://github.com/KuoHaoZeng/Visual_Reaction","commit_stats":{"total_commits":7,"total_committers":1,"mean_commits":7.0,"dds":0.0,"last_synced_commit":"33614b7b22c2153dc0c847c5b1991540a6b53a36"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/KuoHaoZeng/Visual_Reaction","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KuoHaoZeng%2FVisual_Reaction","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KuoHaoZeng%2FVisual_Reaction/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KuoHaoZeng%2FVisual_Reaction
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KuoHaoZeng%2FVisual_Reaction/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KuoHaoZeng","download_url":"https://codeload.github.com/KuoHaoZeng/Visual_Reaction/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KuoHaoZeng%2FVisual_Reaction/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279018775,"owners_count":26086452,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai2-thor","computer-vision","drone","forecasting","reinforcement-learning","visual-reaction"],"created_at":"2025-10-14T10:26:09.172Z","updated_at":"2025-10-14T10:26:10.387Z","avatar_url":"https://github.com/KuoHaoZeng.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## [Visual Reaction: Learning to Play Catch with Your Drone](https://arxiv.org/pdf/1912.02155.pdf)\n\nBy Kuo-Hao Zeng, Roozbeh Mottaghi, Luca Weihs, and Ali Farhadi\n\n[Paper](https://arxiv.org/pdf/1912.02155.pdf) | [Video](https://youtu.be/iyAoPuHxvYs) | [BibTex](#citing)\n\n![](figs/DroneCatch.gif)\n\nWe address the problem of Visual Reaction, where the idea is to forecast the future and plan accordingly. 
We study the task in the context of catching objects with a drone. An object is thrown in the air, and the drone should plan to catch it. Each object has different physical properties and might collide with other objects and structures in the scene, making the task quite challenging. \n\n### Citing\n\nIf you find this project useful in your research, please consider citing:\n\n```\n@inproceedings{khz2020visualreaction,\n  author = {Zeng, Kuo-Hao and Mottaghi, Roozbeh and Weihs, Luca and Farhadi, Ali},\n  title = {Visual Reaction: Learning to Play Catch with Your Drone},\n  booktitle = {CVPR},\t    \n  year = {2020}\n}\n```\n\n### Set Up\n\n0. Requirements\n\n   We developed this codebase on Ubuntu 18.04.3 LTS and have also tried it on Ubuntu 16.\n\n   In addition, this codebase needs to be executed on GPU(s).\n\n1. Clone this repository\n\n   ```\n   git clone git@github.com:KuoHaoZeng/Visual_Reaction.git\n   ```\n   \n2. Install `xorg` if the machine does not have it\n\n   **Note**: This codebase should be executed on GPU. Thus, we need xserver for GPU rendering.\n\n   ```\n   # Need sudo permission to install xserver\n   sudo apt-get install xorg\n   ```\n\n   Then, reconfigure the xserver for GPU\n\n   ```\n   sudo python startx.py\n   ```\n\n3. Using `python 3.6`, create a `venv`\n\n   **Note**: The `python` version needs to be `3.6` or above, since `python 2.x` may have issues with some required packages.\n   \n   ```\n   # Create venv and activate it\n   python -m venv venv \u0026\u0026 source venv/bin/activate\n   ```\n   \n4. Install the requirements with\n\n   ```\n   # Make sure you execute this under (venv) environment\n   pip install -r requirements.txt\n   ```\n\n### Environment/Dataset\n\nWe extend [AI2-THOR](http://ai2thor.allenai.org/) by adding a drone agent and a launcher to play\ncatch. 
After the launcher throws an object, the drone needs to\npredict the trajectory of the object from ego-centric observations\nand move to a position that can catch the object.\n\nWe collect a dataset consisting of 30 scenes (Living Rooms) in\nAI2-THOR. The initial positions of the drone and the launcher are random.\nThe launcher randomly selects an object from a list of 20 objects, and throws\nit with random magnitudes in random directions. Overall, we collect 20K \ntraining trajectories, 5K validation trajectories, and 5K testing trajectories.\n\nThe [training data](data/train.json), [validation data](data/val.json), and [testing data](data/test.json) are available in the `data` folder.\n\nFor more information about how to control the drone agent in the environment, please visit this [Doc](https://ai2thor.allenai.org/ithor/documentation/).\n\n#### Generate your own data\n\n```\nTBD\n```\n\nAfter the data generation, you need to change the `data_dir` to your data folder in the config file:\n\n```\n...\nbase_dir: \"results/{{exp_prefix}}\"\ndata_dir: \"data\" \u003c-- change it to your data folder.\n...\n```\n\n### Play Catch with Your Drone!\n\n**Note**: You can always adjust the hyperparameters defined in the config file to change settings such as how many GPUs are going to be used, how many threads are going to be used, how often you want to store a checkpoint, etc. You can also change the learning rate, number of iterations, batch size, etc. 
in the config file.\n\n#### Test the pretrained model on validation/testing set\n\n```\n# Download the pretrained model\nwget https://homes.cs.washington.edu/~khzeng/Visual_Reaction/pretrained.zip\nunzip pretrained.zip \u0026\u0026 mkdir results \u0026\u0026 mv pretrained/* results/ \u0026\u0026 rm pretrained.zip\n\n# Test the model\n# The testing results are stored in a jsonlines file\n# For the forecaster only\npython main.py --config configs/pretrained_forecaster_test.yaml\n# For the forecaster + action_sampler\npython main.py --config configs/pretrained_action_sampler_test.yaml\n\n# Evaluate the results\n# For the forecaster only\npython eval.py results/pretrained_forecaster test\n# For the forecaster + action_sampler\npython eval.py results/pretrained_action_sampler test\n```\n\n#### Train a new forecaster\n\n```\n# Train\npython main.py --config configs/forecaster_train.yaml\n\n# Validate or Test\npython main.py --config configs/forecaster_val.yaml\npython main.py --config configs/forecaster_test.yaml\n\n# Eval\npython eval.py results/forecaster val\npython eval.py results/forecaster test\n```\n\n#### Train a new action sampler with a trained forecaster\n\n```\n# Prepare the trained forecaster\ncd results\nmkdir action_sampler\nmkdir action_sampler/checkpoints\ncp -r forecaster/checkpoints/$FORECASTER_YOU_LIKE_TO_USE action_sampler/checkpoints/0000000\ncd ..\n\n# Train\npython main.py --config configs/action_sampler_train.yaml\n\n# Validate or Test\npython main.py --config configs/action_sampler_val.yaml\npython main.py --config configs/action_sampler_test.yaml\n\n# Eval\npython eval.py results/action_sampler val\npython eval.py results/action_sampler test\n```\n\n#### Main Results\n\n| Model  | Success Rate |\n| :-------------: | :-------------: |\n| Forecaster w/ uniform AS (ours) | 26.0 \u0026pm; 1.3 |\n| Forecaster w/ action sampler (ours) | 29.3 \u0026pm; 0.9 |\n| CPP w/ KF | 23.2 \u0026pm; 1.3 |\n| CPP | 22.9 \u0026pm; 2.3 |\n\n#### Different 
Mobility\n\n\u003cimg src=\"figs/mobility_result.jpeg\" style=\"zoom:150%;\" /\u003e\n\n#### Noisy Movement\n\n\u003cimg src=\"figs/noise_result.jpeg\" style=\"zoom:150%;\" /\u003e","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkuohaozeng%2Fvisual_reaction","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkuohaozeng%2Fvisual_reaction","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkuohaozeng%2Fvisual_reaction/lists"}