https://github.com/zzxslp/MM-Navigator

# GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation
[GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation](https://arxiv.org/pdf/2311.07562.pdf)

Our code and evaluation benchmark will be out soon!

## Demo
The demo figure shows GPT-4V shopping in the Amazon app on an iPhone.



## Citation

If you find our work helpful to your research, please consider citing the paper:

```bibtex
@article{yan2023gpt,
  title={GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation},
  author={Yan, An and Yang, Zhengyuan and Zhu, Wanrong and Lin, Kevin and Li, Linjie and Wang, Jianfeng and Yang, Jianwei and Zhong, Yiwu and McAuley, Julian and Gao, Jianfeng and others},
  journal={arXiv preprint arXiv:2311.07562},
  year={2023}
}
```