[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
- Host: GitHub
- URL: https://github.com/OpenDriveLab/DriveLM
- Owner: OpenDriveLab
- License: apache-2.0
- Created: 2023-08-08T12:07:33.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-04T16:56:25.000Z (about 1 month ago)
- Last Synced: 2025-03-10T14:11:26.934Z (about 1 month ago)
- Topics: autonomous-driving, chain-of-thought, graph-of-thoughts, large-language-models, llm, prompt-engineering, prompting, tree-of-thoughts, vision-language
- Language: HTML
- Homepage: https://opendrivelab.com/DriveLM/
- Size: 274 MB
- Stars: 998
- Watchers: 22
- Forks: 65
- Open Issues: 21
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff
Awesome Lists containing this project
- Awesome-Reasoning-Foundation-Models
- Awesome-Multimodal-LLM-Autonomous-Driving
- awesome-knowledge-driven-AD
- Awesome-LLM4AD
README
**DriveLM:** *Driving with **G**raph **V**isual **Q**uestion **A**nswering*

`Autonomous Driving Challenge 2024` **Driving-with-Language** [Leaderboard](https://opendrivelab.com/challenge2024/#driving_with_language)
[Project Page](https://opendrivelab.com/DriveLM/) | [License](#licenseandcitation) | [Paper](https://arxiv.org/abs/2312.14150) | [Getting Started](#gettingstarted) | [Leaderboard](https://huggingface.co/spaces/AGC2024/driving-with-language-official)

https://github.com/OpenDriveLab/DriveLM/assets/54334254/cddea8d6-9f6e-4e7e-b926-5afb59f8dce2
🔥 We instantiate datasets (**DriveLM-Data**) built upon nuScenes and CARLA, and propose a VLM-based baseline approach (**DriveLM-Agent**) for jointly performing **Graph VQA** and end-to-end driving.
🏁 **DriveLM** serves as a main track in the [**`CVPR 2024 Autonomous Driving Challenge`**](https://opendrivelab.com/challenge2024/#driving_with_language). Everything you need for the challenge is [HERE](https://github.com/OpenDriveLab/DriveLM/tree/main/challenge), including the baseline, test data, submission format, and evaluation pipeline!
- **`[2025/01/08]`** [Drive-Bench](https://drive-bench.github.io/) released! An in-depth analysis of what DriveLM-style benchmarks really evaluate. Take a look at the [arXiv paper](https://arxiv.org/pdf/2501.04003).
- **`[2024/07/16]`** The DriveLM [official leaderboard](https://huggingface.co/spaces/AGC2024/driving-with-language-official) has reopened!
- **`[2024/07/01]`** DriveLM got accepted to ECCV 2024! Congrats to the team!
- **`[2024/06/01]`** The challenge has ended! [See the final leaderboard](https://opendrivelab.com/challenge2024/#driving_with_language).
- **`[2024/03/25]`** The challenge test server is online and the test questions are released. [Check it out!](https://github.com/OpenDriveLab/DriveLM/tree/main/challenge)
- **`[2024/02/29]`** Challenge repo released, including the baseline, data and submission format, and evaluation pipeline. [Have a look!](https://github.com/OpenDriveLab/DriveLM/tree/main/challenge)
- **`[2023/12/22]`** DriveLM-nuScenes full `v1.0` and the [paper](https://arxiv.org/abs/2312.14150) released.
- **`[2023/08/25]`** DriveLM-nuScenes demo released.

## Table of Contents
1. [Highlights](#highlight)
2. [Getting Started](#gettingstarted)
- [Prepare DriveLM-nuScenes](docs/data_prep_nus.md)
3. [Current Endeavors and Future Horizons](#timeline)
4. [TODO List](#newsandtodolist)
5. [DriveLM-Data](#drivelmdata)
- [Comparison and Stats](#comparison)
- [GVQA Details](docs/gvqa.md)
- [Annotation and Features](docs/data_details.md)
6. [License and Citation](#licenseandcitation)
7. [Other Resources](#otherresources)

## Getting Started
To get started with DriveLM:
- [Prepare DriveLM-nuScenes](/docs/data_prep_nus.md)
- [Challenge devkit](/challenge/)
- [More content coming soon](#todolist)

## Current Endeavors and Future Directions
> - The advent of GPT-style multimodal models in real-world applications motivates the study of the role of language in driving.
> - Dates below reflect the arXiv submission dates.
> - If there is any missing work, please reach out to us!
DriveLM attempts to address some of the challenges faced by the community.
- **Lack of data**: DriveLM-Data serves as a comprehensive benchmark for driving with language.
- **Embodiment**: GVQA provides a potential direction for embodied applications of LLMs / VLMs.
- **Closed-loop**: DriveLM-CARLA attempts to explore closed-loop planning with language.

## TODO List

- [x] DriveLM-Data
- [x] DriveLM-nuScenes
- [x] DriveLM-CARLA
- [x] DriveLM-Metrics
- [x] GPT-score
- [ ] DriveLM-Agent
- [x] Inference code on DriveLM-nuScenes
- [ ] Inference code on DriveLM-CARLA

## DriveLM-Data

We facilitate the `Perception, Prediction, Planning, Behavior, Motion` tasks with human-written reasoning logic connecting them, and propose the task of [GVQA](docs/gvqa.md) on DriveLM-Data.
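For intuition only, here is a minimal, self-contained sketch of what a QA graph over these stages could look like; it is not code from this repository, and `QANode`, `GVQAGraph`, and the example questions are hypothetical illustrations of the idea that answers at later stages condition on QA pairs from earlier stages.

```python
from dataclasses import dataclass, field

# Hypothetical illustration of Graph VQA: QA nodes per driving stage,
# with directed edges encoding logical dependencies between stages.
STAGES = ["perception", "prediction", "planning", "behavior", "motion"]

@dataclass
class QANode:
    node_id: str
    stage: str                                    # one of STAGES
    question: str
    answer: str
    parents: list = field(default_factory=list)  # ids of QA nodes this one depends on

@dataclass
class GVQAGraph:
    nodes: dict = field(default_factory=dict)

    def add(self, node: QANode) -> None:
        assert node.stage in STAGES, f"unknown stage: {node.stage}"
        self.nodes[node.node_id] = node

    def context_for(self, node_id: str) -> list:
        """Collect upstream QA pairs a model could condition on when answering."""
        return [(self.nodes[p].question, self.nodes[p].answer)
                for p in self.nodes[node_id].parents]

# Toy chain: a perception QA feeds a prediction QA, which feeds a planning QA.
g = GVQAGraph()
g.add(QANode("q0", "perception", "What objects are ahead of the ego vehicle?",
             "A pedestrian near the crosswalk."))
g.add(QANode("q1", "prediction", "What will the pedestrian do next?",
             "Likely cross the road.", parents=["q0"]))
g.add(QANode("q2", "planning", "What should the ego vehicle do?",
             "Slow down and yield.", parents=["q1"]))
print(g.context_for("q2"))
```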
### 📊 Comparison and Stats
**DriveLM-Data** is the *first* language-driving dataset facilitating the full stack of driving tasks with graph-structured logical dependencies.
![]()
Links to details about [GVQA task](docs/gvqa.md), [Dataset Features](docs/data_details.md/#features), and [Annotation](docs/data_details.md/#annotation).
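For a concrete feel of how such annotations might be consumed, below is a minimal sketch that walks a DriveLM-nuScenes-style JSON file and counts QA pairs per stage. The file path and the `scene -> key_frames -> QA -> stage` layout are assumptions made for illustration; refer to [Annotation](docs/data_details.md/#annotation) for the authoritative format.

```python
import json
from collections import Counter

# Assumed (hypothetical) layout, see docs/data_details.md for the real schema:
# {scene_token: {"key_frames": {frame_token: {"QA": {stage: [{"Q": ..., "A": ...}, ...]}}}}}
ANNOTATION_FILE = "data/DriveLM_nuScenes_train.json"  # placeholder path

with open(ANNOTATION_FILE) as f:
    scenes = json.load(f)

qa_per_stage = Counter()
for scene_token, scene in scenes.items():
    for frame_token, frame in scene.get("key_frames", {}).items():
        for stage, qa_list in frame.get("QA", {}).items():
            qa_per_stage[stage] += len(qa_list)

print(dict(qa_per_stage))  # e.g. QA counts keyed by perception/prediction/planning/behavior
```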
## License and Citation
All assets and code in this repository are under the [Apache 2.0 license](./LICENSE) unless specified otherwise. The language data is under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). Other datasets (including nuScenes) inherit their own distribution licenses. Please consider citing our paper and project if they help your research.

```BibTeX
@article{sima2023drivelm,
title={DriveLM: Driving with Graph Visual Question Answering},
author={Sima, Chonghao and Renz, Katrin and Chitta, Kashyap and Chen, Li and Zhang, Hanxue and Xie, Chengen and Luo, Ping and Geiger, Andreas and Li, Hongyang},
journal={arXiv preprint arXiv:2312.14150},
year={2023}
}
```

```BibTeX
@misc{contributors2023drivelmrepo,
title={DriveLM: Driving with Graph Visual Question Answering},
author={DriveLM contributors},
howpublished={\url{https://github.com/OpenDriveLab/DriveLM}},
year={2023}
}
```

## Other Resources

**OpenDriveLab**
- [DriveAGI](https://github.com/OpenDriveLab/DriveAGI) | [UniAD](https://github.com/OpenDriveLab/UniAD) | [OpenLane-V2](https://github.com/OpenDriveLab/OpenLane-V2) | [Survey on E2EAD](https://github.com/OpenDriveLab/End-to-end-Autonomous-Driving)
- [Survey on BEV Perception](https://github.com/OpenDriveLab/BEVPerception-Survey-Recipe) | [BEVFormer](https://github.com/fundamentalvision/BEVFormer) | [OccNet](https://github.com/OpenDriveLab/OccNet)

**Autonomous Vision Group**
- [tuPlan garage](https://github.com/autonomousvision/tuplan_garage) | [CARLA garage](https://github.com/autonomousvision/carla_garage) | [Survey on E2EAD](https://github.com/OpenDriveLab/End-to-end-Autonomous-Driving)
- [PlanT](https://github.com/autonomousvision/plant) | [KING](https://github.com/autonomousvision/king) | [TransFuser](https://github.com/autonomousvision/transfuser) | [NEAT](https://github.com/autonomousvision/neat)