https://github.com/DHPR-dataset/DHPR-dataset

[IEEE-TIV 2024] Exploring the Potential of Multi-Modal AI for Driving Hazard Prediction

Host: GitHub
URL: https://github.com/DHPR-dataset/DHPR-dataset
Owner: DHPR-dataset
License: bsd-3-clause
Created: 2023-06-10T05:19:55.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-07-05T16:48:30.000Z (about 1 year ago)
Last Synced: 2024-09-27T17:01:53.308Z (10 months ago)
Homepage: https://ieeexplore.ieee.org/document/10568360
Size: 3 MB
Stars: 4
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

Awesome-LLM4AD - DHPR

DHPR: Driving Hazard Prediction and Reasoning

Paper | Huggingface Dataset | Download Assets | Dataset Demo | Evaluation Server | Inference Demo


Table of Contents

* Introduction
* Demo
* Data Files
* Evaluation
* Leaderboard
* License


## Introduction

This repository contains details about the DHPR (Driving Hazard Prediction and Reasoning) dataset. 

The DHPR dataset was introduced for the problem of predicting hazards that drivers may encounter while driving. We formulate this problem as visual abductive reasoning from a single input image captured by a car dashcam.

The dataset consists of:

* 14,975 street scenes 

* Car speeds 

* Hazard descriptions

* Visual entity descriptions (Oracle Scenario Only)

## Demo

More details of the dataset are available in the Dataset Demo.

## Data Files

The annotation files are organized in the following file tree:

```
annotation_files
├── anno_train.json
└── anno_val.json
```
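As a minimal sketch of how these files might be inspected, the snippet below loads each annotation file and reports its top-level structure. The README does not document the JSON schema, so the code deliberately assumes nothing about field names; the helper names `load_annotations` and `describe` are hypothetical and not part of the repository.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_annotations(path: str) -> Any:
    """Parse one annotation file and return the decoded JSON value."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def describe(data: Any) -> Dict[str, Any]:
    """Report the top-level container type and size without assuming a schema."""
    info: Dict[str, Any] = {"type": type(data).__name__}
    if isinstance(data, (list, dict)):
        info["num_entries"] = len(data)
    if isinstance(data, dict):
        # Show a few keys so the layout can be inspected by eye.
        info["sample_keys"] = list(data)[:5]
    return info


if __name__ == "__main__":
    # Assumption: the files sit in ./annotation_files as in the tree above.
    for name in ("anno_train.json", "anno_val.json"):
        file_path = Path("annotation_files") / name
        if file_path.exists():
            print(name, describe(load_annotations(str(file_path))))
        else:
            print(name, "not found; download the annotation files first")
```

Inspecting the container type and a few keys first is a cautious way to learn the layout before writing code that depends on specific fields.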

## Evaluation

To be updated.

    

## Leaderboard

To submit results, please upload the result file (To be updated).

#### Leaderboard of results for the image retrieval (IR) and text retrieval (TR) tasks and the generation task on the DHPR test split. The retrieval tasks are evaluated by the average rank and Recall@1 (R@1). The generation task is evaluated using BLEU (B4), ROUGE (R), CIDEr (C), SPIDEr (S), and the GPT-4 score. For all metrics except the rank metric, higher values indicate better performance. For GPT-4V, we perform a zero-shot evaluation on the test split.

| Model | Visual Encoder | IR Rank | IR R@1 | TR Rank | TR R@1 | Text Decoder | B4 | R | C | S | GPT-4 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CLIP | ViT-L/14 | 10.8 | 24.1% | 10.9 | 24.8% | - | - | - | - | - | - |
| BLIP | ViT-B/16 | 15.3 | 9.3% | 15.9 | 8.1% | BERT | 12.6 | 32.9 | 34.9 | 30.3 | 39.3 |
| BLIP2 | ViT-g/14 | 11.5 | 19.1% | 12.1 | 19.8% | OPT-6.7B | 18.7 | 42.7 | 38.9 | 35.4 | 50.5 |
| LLaVA-1.5 | ViT-L/14 | - | - | - | - | LLaMA-2 7B | 14.9 | 36.9 | 34.5 | 30.9 | 56.2 |
| GPT-4V | - | - | - | - | - | GPT-4 | 0.3 | 19.0 | 0.9 | 7.2 | 50.0 |
| Ours | ViT-L/14 | 10.2 | 24.9% | 10.3 | 26.3% | LLaMA-2 7B | 16.9 | 39.5 | 49.1 | 39.6 | 58.5 |
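To illustrate how the two retrieval metrics named in the leaderboard caption relate, here is a small sketch that computes the average rank and Recall@1 from a similarity matrix. It is not the authors' evaluation code: the function name `retrieval_metrics`, the square-matrix layout, and the convention that the correct candidate for query `i` sits at index `i` are all assumptions made for illustration, and the candidate pool size used in the actual benchmark is not stated here.

```python
from typing import List, Tuple


def retrieval_metrics(sim: List[List[float]]) -> Tuple[float, float]:
    """Return (average rank, Recall@1) for a square similarity matrix.

    Assumption: sim[i][j] scores query i against candidate j, and the
    correct candidate for query i sits at index i.
    """
    ranks = []
    for i, row in enumerate(sim):
        # Rank 1 means no other candidate scores strictly higher
        # than the ground-truth candidate.
        rank = 1 + sum(1 for score in row if score > row[i])
        ranks.append(rank)
    avg_rank = sum(ranks) / len(ranks)
    recall_at_1 = sum(1 for r in ranks if r == 1) / len(ranks)
    return avg_rank, recall_at_1


# Toy example: queries 0 and 2 rank their match first, query 1 ranks it second.
toy = [
    [0.9, 0.1, 0.3],
    [0.8, 0.5, 0.2],
    [0.2, 0.4, 0.7],
]
avg_rank, r_at_1 = retrieval_metrics(toy)
print(f"average rank = {avg_rank:.2f}, R@1 = {r_at_1:.1%}")
```

A lower average rank and a higher Recall@1 both indicate that the correct item tends to be retrieved near the top, which is why the rank column is the only one in the table where smaller is better.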

## License

The dataset used in this paper is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

## Citation

```bibtex
@article{10568360,
  author={Charoenpitaks, Korawat and Nguyen, Van-Quang and Suganuma, Masanori and Takahashi, Masahiro and Niihara, Ryoma and Okatani, Takayuki},
  journal={IEEE Transactions on Intelligent Vehicles},
  title={Exploring the Potential of Multi-Modal AI for Driving Hazard Prediction},
  year={2024},
  volume={},
  number={},
  pages={1-11},
  keywords={Hazards;Cognition;Videos;Automobiles;Accidents;Task analysis;Natural languages;Vision;Language;Reasoning;Traffic Accident Anticipation},
  doi={10.1109/TIV.2024.3417353}
}
```
