https://github.com/google-deepmind/aloha_sim
A collection of tabletop tasks in MuJoCo
- Host: GitHub
- URL: https://github.com/google-deepmind/aloha_sim
- Owner: google-deepmind
- License: other
- Created: 2025-06-14T09:29:05.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-06-30T19:27:25.000Z (7 months ago)
- Last Synced: 2025-07-23T21:16:19.799Z (6 months ago)
- Topics: robotics, simulation
- Language: Python
- Homepage:
- Size: 50.2 MB
- Stars: 208
- Watchers: 2
- Forks: 19
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# Aloha Sim
Aloha Sim is a Python library that defines the simulation environment for the
Aloha robot. It includes a collection of tasks for robot learning and evaluation.
## Installation
Install with pip:
```bash
# create a virtual environment, then install from a checkout of this repo
pip install -e .
```
**OR** run directly with uv:
```bash
pip install uv
uv run .py
```
Tell MuJoCo which rendering backend to use; otherwise the simulation will be very slow:
```bash
export MUJOCO_GL='egl'
```
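MuJoCo picks its rendering backend when a rendering context is first created, so the variable must be set before `mujoco` is imported. A minimal sketch of doing the same thing from Python instead of the shell (the helper name is illustrative, not part of aloha_sim):

```python
import os


def configure_mujoco_gl(backend: str = "egl") -> str:
    """Set MUJOCO_GL unless the user already chose a backend.

    Call this before `import mujoco`; MuJoCo reads the variable when it
    first initializes its rendering context.
    """
    os.environ.setdefault("MUJOCO_GL", backend)
    return os.environ["MUJOCO_GL"]


configure_mujoco_gl()
```

`setdefault` keeps an explicitly exported `MUJOCO_GL` intact, so users on other backends (e.g. `osmesa`) are not overridden.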
## Viewer
Interact with the scene without a policy:
```bash
python aloha_sim/viewer.py --policy=no_policy --task_name=HandOverBanana
```
## Tests
```bash
# individual tests
python aloha_sim/tasks/test/aloha2_task_test.py
python aloha_sim/tasks/test/hand_over_test.py
...
# all tests
python -m unittest discover aloha_sim/tasks/test '*_test.py'
```
## Inference
⚠️ **For Gemini Robotics Trusted Testers Only**
Inference with Gemini Robotics models is intended for
Trusted Testers. If you are not a Trusted Tester, sign up
[here](https://deepmind.google/models/gemini-robotics/).
Follow our [SDK documentation](https://github.com/google-deepmind/gemini-robotics-sdk)
to serve the model. The **same model** used for real-world evaluations can be
directly applied in simulation.
Check out the walkthrough video:
<p align="middle">
<a href="https://www.youtube.com/watch?v=nVMY3-kWhOc" target="_blank"><img src="media/walkthrough.png"
alt="Video walkthrough of sim eval" width="718" height="403" border="10"/></a>
</p>
### Install SDK dependency
```bash
pip install aloha_sim[inference]
```
### Interactive Rollouts
Start the viewer with a chosen task:
```bash
# default task: "put the banana in the bowl"
python aloha_sim/viewer.py
# "remove the cap from the marker"
python aloha_sim/viewer.py --task_name=MarkerRemoveLid
# "place the can opener in the left compartment of the caddy"
python aloha_sim/viewer.py --task_name=ToolsPlaceCanOpenerInLeftCompartment
...
```
Check out [task_suite.py](aloha_sim/task_suite.py#L42) for the full list of
available tasks.
You can use the viewer to pause/resume the environment,
interact with the objects, and enter new instructions for the robot:
```
Instructions for using the viewer:
- shift + 'i' = enter new instruction
- space bar = pause/resume
- backspace = reset environment
- mouse right moves the camera
- mouse left rotates the camera
- double-click to select an object
When the environment is not running:
- ctrl + mouse left rotates a selected object
- ctrl + mouse right moves a selected object
When the environment is running:
- ctrl + mouse left applies torque to an object
- ctrl + mouse right applies force to an object
```
### Eval
```bash
python aloha_sim/run_eval.py
```
Runs N evaluation episodes for each task and saves videos
in `/tmp/`.
## Benchmark
Success rates (with 95% confidence intervals) from 100 episodes per task × 3
runs.
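For context, a half-width like DrawerOpen's 3.83 is close to what a normal-approximation binomial interval gives for 300 pooled episodes; the exact method used for the table may differ. A sketch (the function name is illustrative):

```python
import math


def success_rate_ci(successes: int, episodes: int, z: float = 1.96) -> tuple[float, float]:
    """Success rate and 95% CI half-width, both in percent (normal approximation)."""
    p = successes / episodes
    half_width = z * math.sqrt(p * (1 - p) / episodes)
    return 100 * p, 100 * half_width


# e.g. 261 successes over 300 episodes gives roughly 87.0 ± 3.8
rate, half_width = success_rate_ci(successes=261, episodes=300)
```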
### Basic Tasks
| Task | Gemini Robotics On Device<br>Success Rate (95% CI) |
|:------|:-----------------------:|
| BowlOnRack | 99.3 (0.93) |
| DrawerOpen | 87.0 (3.83) |
| HandOverBanana | 99.0 (1.13) |
| HandOverPen | 93.7 (2.77) |
| LaptopClose | 78.0 (4.71) |
### Instruction-Following Tasks
| Task | Gemini Robotics On Device<br>Success Rate (95% CI) |
|:------|:-----------------------:|
| DiningPlaceBananaInBowl | 93.3 (2.84) |
| DiningPlaceMugOnPlate | 32.0 (5.31) |
| DiningPlacePenInContainer | 19.7 (4.52) |
| ToolsPlaceCanOpenerInLeftCompartment | 82.0 (4.37) |
| ToolsPlaceCanOpenerInRightCompartment | 73.3 (5.03) |
| ToolsPlaceMagnifierInRightCompartment | 91.3 (3.20) |
| ToolsPlaceMagnifierInLeftCompartment | 81.7 (4.40) |
| ToolsPlaceScissorsInLeftCompartment | 73.3 (5.03) |
| ToolsPlaceScissorsInRightCompartment | 79.0 (4.64) |
| ToolsPlaceScrewdriverInLeftCompartment | 81.0 (4.46) |
| ToolsPlaceScrewdriverInRightCompartment | 78.3 (4.69) |
| BlocksSpelling | 4.7 (2.40) |
### Dexterous Tasks
| Task | Gemini Robotics On Device<br>Success Rate (95% CI) |
|:------|:-----------------------:|
| MarkerRemoveLid | 73.7 (5.01) |
| DesktopWrapHeadphone | 8.7 (3.21) |
| TowelFoldInHalf | 6.7 (3.18) |
To reproduce the results, use `random_seed = 42 + episode_index`.
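In other words, episode `i` of each task uses seed `42 + i`. A trivial sketch of generating the seed sequence for a 100-episode run (the helper name is illustrative):

```python
def episode_seeds(num_episodes: int, base_seed: int = 42) -> list[int]:
    """Per-episode seeds following random_seed = 42 + episode_index."""
    return [base_seed + episode_index for episode_index in range(num_episodes)]


seeds = episode_seeds(100)  # 42, 43, ..., 141
```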
## Tips
- If the environment stepping is very slow, check that you are using the right
rendering backend, e.g. `MUJOCO_GL='egl'`.
- Tasks with deformable objects, like `DesktopWrapHeadphone` and
`TowelFoldInHalf`, are slow to simulate and to interact with directly in `viewer.py`.
## Note
This is not an officially supported Google product.
