Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection
https://github.com/tumftm/dpft
- Host: GitHub
- URL: https://github.com/tumftm/dpft
- Owner: TUMFTM
- License: apache-2.0
- Created: 2024-03-27T10:55:26.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-25T12:08:45.000Z (9 months ago)
- Last Synced: 2025-04-02T08:49:01.962Z (6 months ago)
- Topics: autonomous-driving, camera, object-detection, perception, radar, sensor-fusion
- Language: Python
- Homepage:
- Size: 812 KB
- Stars: 74
- Watchers: 6
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# DPFT: Dual Perspective Fusion Transformer
[Linux](https://www.linux.org/) [Docker](https://www.docker.com/) [Python](https://www.python.org/downloads/) [arXiv](https://arxiv.org/abs/2404.03015)
## 📄 Overview
The perception systems of autonomous vehicles have to be efficient, robust, and cost-effective. However, cameras are not robust against severe weather conditions, lidar sensors are expensive, and the performance of radar-based perception is still inferior to the others. Camera-radar fusion methods have been proposed to address this issue, but these are constrained by the typical sparsity of radar point clouds and are often designed for radars without elevation information. We propose a novel camera-radar fusion approach called Dual Perspective Fusion Transformer (DPFT), designed to overcome these limitations. Our method leverages lower-level radar data (the radar cube) instead of processed point clouds to preserve as much information as possible, and it employs projections in both the camera and ground planes to effectively use radars with elevation information and to simplify the fusion with camera data. As a result, DPFT achieves state-of-the-art performance on the K-Radar dataset while showing remarkable robustness against adverse weather conditions and maintaining a low inference time.

## 🏆 Results
#### 3D object detection on the K-Radar dataset

| Model | Modality | Total | Normal | Overcast | Fog  | Rain | Sleet | LightSnow | HeavySnow | Revision | Checkpoint |
|-------|----------|-------|--------|----------|------|------|-------|-----------|-----------|----------|------------|
| DPFT  | C + R    | 56.1  | 55.7   | 59.4     | 63.1 | 49.0 | 51.6  | 50.5      | 50.5      | v1.0     | [Link](https://zenodo.org/records/14738706/files/20240203-232344-241_checkpoint_0122.pt?download=1) |
| DPFT  | C + R    | 50.5  | 51.1   | 45.2     | 64.2 | 39.9 | 42.9  | 42.4      | 51.1      | v2.0     | [Link](https://zenodo.org/records/14738706/files/20240220-123248-537_checkpoint_0049.pt?download=1) |

## 💿 Dataset
This project is based on the [K-Radar](https://github.com/kaist-avelab/K-Radar) dataset. To set it up correctly, follow these two steps:

1. Get the dataset from https://github.com/kaist-avelab/K-Radar
2. Structure the dataset accordingly
K-Radar Data Structure

```
.
|
+---data/
| |
| +---kradar/
| | |
| | +---raw/
| | | |
| | | +---1/
| | | | |
| | | | +---cam-front/
| | | | |
| | | | +---cam-left/
| | | | |
| | | | +---cam-rear/
| | | | |
| | | | +---cam-right/
| | | | |
| | | | +---description.txt
| | | | |
| | | | +---info_calib/
| | | | |
| | | | +---info_frames/
| | | | |
| | | | +---info_label/
| | | | |
| | | | +---info_label_v2/
| | | | |
| | | | +---info_matching/
| | | | |
| | | | +---os1-128/
| | | | |
| | | | +---os2-64/
| | | | |
| | | | +---radar_tesseract/
| | | | |
| | | | +---...
| | | |
| | | +---2, 3...
```
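Before pre-processing, it can be worth sanity-checking that the dataset matches the layout above. The following is a minimal sketch (my own helper, not part of the repository) that flags sequences with missing sub-folders, using only the folder names shown in the tree:

```python
from pathlib import Path

# Sub-folders each numbered sequence (raw/1/, raw/2/, ...) is expected
# to contain, taken from the directory tree above.
REQUIRED = [
    "cam-front", "cam-left", "cam-rear", "cam-right",
    "info_calib", "info_frames", "info_label", "radar_tesseract",
]

def missing_folders(sequence_dir):
    """Return the required sub-folders that are absent from one sequence."""
    seq = Path(sequence_dir)
    return [name for name in REQUIRED if not (seq / name).is_dir()]

def check_kradar_root(root):
    """Map every numbered sequence under the raw root to its missing folders."""
    return {
        seq.name: missing_folders(seq)
        for seq in sorted(Path(root).iterdir())
        if seq.is_dir() and seq.name.isdigit()
    }
```

An empty list for every sequence means the layout matches; any non-empty list names the folders to fix before running the prepare step.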
## 💾 Install
We recommend a Docker-based installation to ensure a consistent development environment, but we also provide instructions for a local installation. For details, check our [installation instructions](/docs/INSTALL.md).

Build the Docker image:

```
docker build -t dprt:0.0.1 .
```

Run the container:

```
docker run \
--name dprt \
-it \
--gpus all \
-e DISPLAY \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v :/app \
-v :/data \
dprt:0.0.1 bash
```

## 🔨 Usage
The usage of our model consists of three major steps.

### 1. Prepare
First, prepare the training and evaluation data by pre-processing the raw dataset. This not only extracts the essential information from the original dataset but also reduces the data size from 16 TB to only 670 GB.
```
python -m dprt.prepare --src /data/kradar/raw/ --cfg /app/config/kradar.json --dst /data/kradar/processed/
```

General form:

```
python -m dprt.prepare
  --src
  --cfg
  --dst
```

### 2. Train
Second, train the DPFT model on the previously prepared data or continue a previous training run from a checkpoint.
```
python -m dprt.train --src /data/kradar/processed/ --cfg /app/config/kradar.json
```

General form:

```
python -m dprt.train
  --src
  --cfg
  --dst
  --checkpoint
```

### 3. Evaluate
Third, evaluate the performance of a previously trained model checkpoint.
```
python -m dprt.evaluate --src /data/kradar/processed/ --cfg /app/config/kradar.json --checkpoint /app/log/
```

General form:

```
python -m dprt.evaluate
  --src
  --cfg
  --dst
  --checkpoint
```

## 📃 Citation
If DPFT is useful or relevant to your research, please kindly recognize our contributions by citing our paper:
```bibtex
@article{fent2024dpft,
  title={DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection},
  author={Felix Fent and Andras Palffy and Holger Caesar},
  journal={arXiv preprint arXiv:2404.03015},
  year={2024}
}
```

## ⁉️ FAQ
#### No CUDA runtime is found
1. Install nvidia-container-runtime
```
sudo apt-get install nvidia-container-runtime
```

2. Edit or create /etc/docker/daemon.json with the following content:
```
{
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}
```
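If a /etc/docker/daemon.json already exists, pasting the snippet above over it would discard your current settings. A minimal sketch (my own helper, not part of the repository) that merges the nvidia runtime keys into an existing configuration dict instead:

```python
import json

# Keys from step 2 above, merged into an existing daemon.json dict
# rather than overwriting the whole file.
NVIDIA_RUNTIME = {
    "path": "/usr/bin/nvidia-container-runtime",
    "runtimeArgs": [],
}

def merge_daemon_config(existing: dict) -> dict:
    """Return a copy of `existing` with the nvidia runtime entries added."""
    merged = dict(existing)
    runtimes = dict(merged.get("runtimes", {}))
    runtimes["nvidia"] = NVIDIA_RUNTIME
    merged["runtimes"] = runtimes
    merged["default-runtime"] = "nvidia"
    return merged
```

Read the file with `json.load`, pass the result through `merge_daemon_config`, and write it back with `json.dump(..., indent=2)` before restarting the daemon in step 4.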
3. Remove the Docker CLI plugin for extended build capabilities:
```
sudo apt remove docker-buildx-plugin
```

4. Restart the Docker daemon:
```
sudo systemctl restart docker
```

5. Build the Docker image:
```
docker build -t dprt:0.0.1 .
```

Reference: https://stackoverflow.com/questions/59691207/docker-build-with-nvidia-runtime
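To confirm the runtime was actually registered after the restart, you can inspect the daemon's runtime list. This sketch (my own addition; it assumes the standard Go-template output of `docker info`) parses the JSON printed by `docker info --format '{{json .Runtimes}}'`:

```python
import json
import subprocess

def nvidia_runtime_registered(runtimes_json: str) -> bool:
    """Check a `docker info` runtimes JSON string for the nvidia runtime."""
    runtimes = json.loads(runtimes_json)
    return "nvidia" in runtimes

def query_docker_runtimes() -> str:
    """Ask the Docker daemon which runtimes it knows about."""
    return subprocess.run(
        ["docker", "info", "--format", "{{json .Runtimes}}"],
        capture_output=True, text=True, check=True,
    ).stdout

# Example (requires a running Docker daemon):
# nvidia_runtime_registered(query_docker_runtimes())
```

If this returns False, re-check the daemon.json from step 2 and the daemon restart from step 4.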
#### fatal error: cusolverDn.h: No such file or directory
1. Export CUDA path:
```
export PATH=/usr/local/cuda/bin:$PATH
```

Reference: https://github.com/microsoft/DeepSpeed/issues/2684
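If exporting the path does not resolve the error, the header may be missing from the CUDA install entirely. A small sketch (my own addition; the /usr/local/cuda default is an assumption about your install prefix) to locate cusolverDn.h:

```python
from pathlib import Path

def find_cusolver_header(cuda_root="/usr/local/cuda"):
    """Return the path to cusolverDn.h under a CUDA install, or None."""
    root = Path(cuda_root)
    if not root.is_dir():
        return None
    matches = sorted(root.glob("**/cusolverDn.h"))
    return matches[0] if matches else None
```

A None result for your CUDA root means the toolkit headers are not installed there, and adjusting PATH alone will not fix the build.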