https://github.com/ai4ce/citywalker
[CVPR 2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
- Host: GitHub
- URL: https://github.com/ai4ce/citywalker
- Owner: ai4ce
- License: apache-2.0
- Created: 2024-10-07T04:03:21.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-04-03T16:48:39.000Z (2 months ago)
- Last Synced: 2025-04-03T17:35:41.698Z (2 months ago)
- Topics: cvpr2025, embodied-ai, embodied-navigation, imitation-learning, outdoor-navigation, point-goal-navigation, quadruped, robot-learning, scaling-law, urban-navigation, video-learning, visual-navigation
- Language: Python
- Homepage: https://ai4ce.github.io/CityWalker/
- Size: 82.8 MB
- Stars: 66
- Watchers: 2
- Forks: 5
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# [CVPR 2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
### TL;DR: CityWalker leverages thousands of hours of online city walking and driving videos to train autonomous agents for robust, generalizable navigation in dynamic urban environments through scalable, data-driven imitation learning.
[Xinhao Liu](https://gaaaavin.github.io/)\*,
[Jintong Li](.)\*,
[Yicheng Jiang](.),
[Niranjan Sujay](.),
[Zhicheng Yang](.),
[Juexiao Zhang](https://juexzz.github.io/),
[John Abanes](.),
[Jing Zhang](https://jingz6676.github.io/),
[Chen Feng](https://engineering.nyu.edu/faculty/chen-feng)†
**Check out a mosaic demo of our dataset:**
https://github.com/user-attachments/assets/02f57a2b-f2d2-4638-a8b0-d837d219735f
# Getting Started
## Installation
The project should be compatible with the latest PyTorch and CUDA versions. The code is tested with Python 3.11, PyTorch 2.5.0, and CUDA 12.1. To install the dependencies, run:
```
conda env create -f environment.yml
conda activate citywalker
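# Optional: verify that PyTorch sees CUDA (assumes a CUDA-capable GPU; skip on CPU-only machines)
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"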
```

## Data Preparation
Please see [dataset/README.md](./dataset/README.md) for details on how to prepare the dataset.
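The details live in that file; purely for orientation, the sketch below shows one generic way to decode a walking video into frames with OpenCV. The paths, sampling rate, and function name are illustrative assumptions, not part of the CityWalker pipeline.

```
# Hypothetical illustration only -- follow dataset/README.md for the actual pipeline.
# Decode a video into JPEG frames at roughly one frame per second.
import os
import cv2

def extract_frames(video_path, out_dir, every_n_sec=1.0):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS metadata is missing
    step = max(int(round(fps * every_n_sec)), 1)
    saved = idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            cv2.imwrite(os.path.join(out_dir, f"{saved:06d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved

extract_frames("walk_tour.mp4", "frames/walk_tour")
```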

## Training
To train the model, run:
```
python train.py --config config/citywalk_2000hr.yaml
```
We provide our **pretrained model** in the [releases tab](https://github.com/ai4ce/CityWalker/releases).
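Before fine-tuning or testing, it can help to sanity-check the downloaded checkpoint. The snippet below is a minimal sketch that assumes a standard PyTorch/Lightning-style checkpoint file; the filename is a placeholder, not the actual release asset name.

```
import torch

# Placeholder path -- point this at the file downloaded from the releases tab.
# weights_only=False restores the permissive loading behavior; only use it on files you trust.
ckpt = torch.load("citywalker_pretrained.ckpt", map_location="cpu", weights_only=False)

# Lightning-style checkpoints usually nest the weights under "state_dict".
state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(f"{len(state)} parameter tensors")
for name, tensor in list(state.items())[:5]:
    print(name, tuple(tensor.shape))
```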

## Fine-tuning
To fine-tune the model, run:
```
python fine_tune.py --config config/finetune.yaml --checkpoint <path/to/checkpoint>
```

## Testing
To test the model, run:
```
python test.py --config config/finetune.yaml --checkpoint <path/to/checkpoint>
```

# Citation
```
@article{liu2024citywalker,
  title={CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos},
  author={Liu, Xinhao and Li, Jintong and Jiang, Yicheng and Sujay, Niranjan and Yang, Zhicheng and Zhang, Juexiao and Abanes, John and Zhang, Jing and Feng, Chen},
  journal={arXiv preprint arXiv:2411.17820},
  year={2024}
}
```

# Acknowledgements
The work was supported by NSF grants 2238968, 2121391, 2322242, and 2345139, and in part through the NYU IT High Performance Computing resources, services, and staff expertise. We thank Xingyu Liu and Zixuan Hu for their help in data collection.

We also thank the authors of the following repositories for their open-source implementations:
* [ViNT: A Foundation Model for Visual Navigation](https://github.com/robodhruv/visualnav-transformer), CoRL 2023
* [NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration](https://github.com/robodhruv/visualnav-transformer), ICRA 2024