# Octopus: Embodied Vision-Language Programmer from Environmental Feedback

Jingkang Yang<sup>*,1</sup>, Yuhao Dong<sup>*,2,5</sup>, Shuai Liu<sup>*,3,5</sup>, Bo Li<sup>*,1</sup>, Ziyue Wang<sup>†,1</sup>, Chencheng Jiang<sup>†,4</sup>, Haoran Tan<sup>†,3</sup>, Jiamu Kang<sup>†,2</sup>, Yuanhan Zhang<sup>1</sup>, Kaiyang Zhou<sup>1</sup>, Ziwei Liu<sup>1,5,✉</sup>

<sup>1</sup>S-Lab, Nanyang Technological University&emsp;<sup>2</sup>Tsinghua University&emsp;<sup>3</sup>Beijing University of Posts and Telecommunications&emsp;<sup>4</sup>Xi'an Jiaotong University&emsp;<sup>5</sup>Shanghai AI Laboratory

<sup>*</sup> Equal Contribution&emsp;<sup>†</sup> Equal Engineering Contribution&emsp;<sup>✉</sup> Corresponding Author

-----------------

![](https://img.shields.io/badge/octopus-v0.1-darkcyan)
![](https://img.shields.io/github/stars/dongyh20/Octopus?style=social)
[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Fdongyh20%2FOctopus&count_bg=%23FFA500&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=visitors&edge_flat=false)](https://hits.seeyoufarm.com)
![](https://black.readthedocs.io/en/stable/_static/license.svg)
![](https://img.shields.io/badge/code%20style-black-000000.svg)

[Project Page](https://choiszt.github.io/Octopus) | [Octopus Paper](https://arxiv.org/abs/2310.08588) | [Demo Video](https://www.youtube.com/watch?v=tmSNw2XonxI)

## 🐙 Introducing Octopus
Octopus is a novel VLM designed to proficiently decipher an agent's vision and textual task objectives, formulate intricate action sequences, and generate executable code; a sketch of this plan-plus-code output appears after the list below. We provide two models based on the following architectures. Please click through for details:
- [LLaVA](./octopus/LLaVA/README.md)
- [Otter](./octopus/Otter/README.md)
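
To make "executable code" concrete, here is a hypothetical example of the plan-plus-code output format described above. The helper and action names (`env.find_object`, `robot.move_to`, ...) are placeholders for illustration, not the actual simulator API:

```python
# Illustrative (hypothetical) example of the kind of output Octopus produces:
# a short plan followed by executable code. The function names below are
# placeholders, not the real simulator primitives.
def act(robot, env):
    # Plan: (1) walk to the fridge, (2) open it, (3) grasp the bottle.
    fridge = env.find_object("fridge")   # hypothetical helper
    robot.move_to(fridge)                # hypothetical action primitive
    robot.open(fridge)
    bottle = env.find_object("water_bottle")
    robot.grasp(bottle)
```

In the real pipelines, code like this is executed inside the OctoVerse simulators described below.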

## 🪸 Introducing OctoVerse
OctoVerse contains three sub-worlds:

| Sub-world | OS (tested) | Environment Goal |
|----------------|------------------|------------------------------------|
| **OctoGibson** | Ubuntu 20.04 | 500 tasks on OmniGibson |
| **OctoGTA** | Windows 11 | 20 tasks to evaluate transfer learning |
| **OctoMC** | Ubuntu/Windows | 20 tasks to evaluate transfer learning in Minecraft worlds, such as making an axe |

This repository provides:

- Training data collection pipeline in the `octogibson` environment
- Evaluation pipeline in the `octogibson` environment
- Evaluation pipeline in the `octogta` environment
- Training pipeline for the `octopus` model

**Contact: Open an issue or email `[email protected]` and `[email protected]`. We are on call to respond.**

## 🦾 Updates

**[2023-10]**

1. 🤗 Introducing Project Octopus' homepage: https://choiszt.github.io/Octopus.
2. 🤗 Check out our [paper](https://arxiv.org/abs/2310.08588), which introduces Octopus in detail.

## 🏁 Get Started
1. **Training Data Collection:** To collect data from the `octogibson` environment, you need to set up two conda environments: `omnigibson` and `gpt4`. The `omnigibson` environment runs the agent, which acts on instructions produced in the `gpt4` environment. Please check out [here](octogibson/README.md) for detailed information.
2. **Evaluation in OctoGibson:** We provide a pipeline in which the simulator sends messages to the Octopus server and receives responses that control the agent (see the sketch after this list).
3. **Evaluation in OctoGTA:** We provide instructions, code, and a MOD so that Octopus can complete tasks in the GTA environment. Please check out [here](octogta/README.md) for detailed information.
4. **Octopus Training:** We provide code for training Octopus. Please check out [here](octopus/README.md) for detailed information.
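
To make the message flow in step 2 concrete, below is a minimal sketch of the simulator-to-server loop, assuming a hypothetical HTTP endpoint and payload schema; the actual transport and message format are defined in the pipeline code linked above:

```python
# Minimal sketch of the simulator <-> Octopus-server loop (step 2).
# The endpoint URL, JSON schema, and `env` methods are hypothetical;
# see the pipeline READMEs for the real protocol.
import requests

OCTOPUS_SERVER = "http://localhost:8000/act"  # hypothetical endpoint

def run_episode(env, task: str, max_steps: int = 10) -> bool:
    """Ask the Octopus server for executable code until the task ends."""
    obs = env.reset()
    for _ in range(max_steps):
        reply = requests.post(
            OCTOPUS_SERVER,
            json={"task": task, "observation": obs},  # hypothetical schema
            timeout=60,
        ).json()
        # The server answers with generated code; execute it in the
        # simulator and read back the new observation and done flag.
        obs, done = env.execute(reply["code"])
        if done:
            return True
    return False
```

The training data collection pipeline in step 1 follows the same request/execute pattern, with the `gpt4` environment playing the role of the code-generating server.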

## 📑 Citation

If you find this repository useful, please consider citing:
```bibtex
@article{yang2023octopus,
  title   = {Octopus: Embodied Vision-Language Programmer from Environmental Feedback},
  author  = {Jingkang Yang and Yuhao Dong and Shuai Liu and Bo Li and Ziyue Wang and Chencheng Jiang and Haoran Tan and Jiamu Kang and Yuanhan Zhang and Kaiyang Zhou and Ziwei Liu},
  journal = {arXiv preprint arXiv:2310.08588},
  year    = {2023},
}
```

## 👨‍🏫 Acknowledgements

We thank the [OmniGibson](https://github.com/StanfordVL/OmniGibson) team for their help and their great contributions to the open-source community.