Contrastive Learning for Lane Detection via Cross-Similarity
https://github.com/sabadijou/clld_official
- Host: GitHub
- URL: https://github.com/sabadijou/clld_official
- Owner: sabadijou
- Created: 2023-02-12T14:13:24.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-10T03:41:08.000Z (over 1 year ago)
- Last Synced: 2025-05-12T19:22:14.581Z (21 days ago)
- Topics: contrastive-learning, instance-segmentation, lane-detection, parallel-programming, python, pytorch, self-supervised-learning
- Language: Python
- Homepage: https://www.mdu.se/en/malardalen-university/centre-for-industrial-digitalisation/projects/autodeep-research-project
- Size: 2 MB
- Stars: 6
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: ReadMe.md
# Contrastive Learning for Lane Detection via Cross-Similarity
>[**Contrastive Learning for Lane Detection via Cross-Similarity**](https://arxiv.org/abs/2308.08242)
## Overview of CLLD
Contrastive Learning for Lane Detection via cross-similarity (CLLD) is a self-supervised learning method that makes lane detection models more resilient to real-world conditions that reduce lane marking visibility. CLLD is a novel multitask contrastive learning method that trains lane detection models to detect lane markings even in low-visibility situations by integrating local feature contrastive learning (CL) with our newly proposed cross-similarity operation (sketched after the list below). For ease of understanding, some details are listed in the following:
- CLLD employs similarity learning to improve the performance of deep neural networks in lane detection, particularly in challenging scenarios.
- The approach aims to enhance the knowledge base of neural networks used in lane detection.
- Our experiments were carried out using `ImageNet` as the pretraining dataset. We employed pioneering lane detection models such as `RESA`, `CLRNet`, and `UNet` to evaluate the impact of our approach on model performance.
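As a rough illustration of the cross-similarity idea, here is a minimal PyTorch sketch in which each spatial feature in one augmented view is compared against a local `(2α+1)×(2α+1)` window of features in the other view. This is an editorial sketch under that assumption, not the official implementation; the function name and tensor shapes are hypothetical.

```python
# Hypothetical sketch of cross-similarity (NOT the official CLLD code):
# each location in feat_a is compared against a local window of
# locations in feat_b. `alpha` mirrors the --alpha window-size argument.
import torch
import torch.nn.functional as F

def cross_similarity(feat_a, feat_b, alpha=1):
    """feat_a, feat_b: (B, C, H, W) projections of two augmented views."""
    b, c, h, w = feat_a.shape
    feat_a = F.normalize(feat_a, dim=1)
    feat_b = F.normalize(feat_b, dim=1)
    # Unfold feat_b into (2*alpha+1)^2 neighbours per spatial location.
    k = 2 * alpha + 1
    neighbours = F.unfold(feat_b, kernel_size=k, padding=alpha)  # (B, C*k*k, H*W)
    neighbours = neighbours.view(b, c, k * k, h * w)
    # Cosine similarity between each location and its cross-view window.
    sims = (feat_a.view(b, c, 1, h * w) * neighbours).sum(dim=1)  # (B, k*k, H*W)
    return sims

# Example: a loss could maximise the mean cross-view similarity.
a, b = torch.randn(2, 256, 14, 14), torch.randn(2, 256, 14, 14)
loss = -cross_similarity(a, b, alpha=1).mean()
```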
*CLLD architecture*

## Get started
1. Clone the repository
```Shell
git clone https://github.com/sabadijou/clld_official.git
```
We refer to this directory as `$RESA_ROOT`.
2. Create an environment and activate it (we used conda, but this is optional)
```Shell
conda create -n clld python=3.9 -y
conda activate clld
```
3. Install dependencies
```Shell
# Install PyTorch first; the cudatoolkit version should match the CUDA version on your system.
# (You can also use pip to install pytorch and torchvision.)
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
# Install kornia and einops
pip install kornia
pip install einops

# Install other dependencies
pip install -r requirements.txt
```
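Optionally, a quick sanity check (our suggestion, not part of the repository) confirms the environment is usable before training:

```python
# Verify that PyTorch sees a GPU and that kornia/einops import cleanly.
import torch
import kornia
import einops

print(torch.__version__, torch.cuda.is_available())
print(kornia.__version__, einops.__version__)
```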
## How to Run CLLD
We conducted pretraining using the training data from `ImageNet`, but you are free to use other datasets and configurations as needed. The configuration files for our approach can be found in the `configs` folder.

Once the dataset and configuration are in place, you can launch pretraining with the following command:
```Shell
python main.py --dataset_path /Imagenet/train --encoder resnet50 --alpha 1 --batch_size 1024 --world_size 1 --gpus_id 0 1
```
The following is a quick guide to the arguments (a hypothetical sketch of how they map onto a distributed launch follows the list):
- `dataset_path`: Path to the training data directory.
- `encoder`: Encoder to pretrain. Options: `resnet18`, `resnet34`, `resnet50`, `resnet101`, `resnet152`, `resnext50_32x4d`, `resnext101_32x8d`, `wide_resnet50_2`, `wide_resnet101_2`.
- `alpha`: Cross-similarity window size.
- `batch_size`: Select a batch size that suits your GPU infrastructure.
- `world_size`: Total number of training processes across all machines. For example, a single machine with 4 GPUs gives a world size of 4; two machines with 4 GPUs each give a world size of 8.
- `gpus_id`: IDs of all GPUs used for training.
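For intuition, this is how `world_size` and `gpus_id` typically relate in a standard PyTorch DistributedDataParallel launch. This is an illustration under that assumption, not code taken from `main.py`:

```python
# Hypothetical DDP launch illustrating world_size/gpus_id semantics.
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank, world_size, gpus_id):
    # One process per GPU; rank indexes into the GPU list.
    dist.init_process_group(
        backend="nccl",
        init_method="tcp://127.0.0.1:23456",
        rank=rank,
        world_size=world_size,
    )
    torch.cuda.set_device(gpus_id[rank])
    # ... build the encoder, wrap it in DDP, run CLLD pretraining ...
    dist.destroy_process_group()

if __name__ == "__main__":
    gpus_id = [0, 1]            # corresponds to --gpus_id 0 1
    world_size = len(gpus_id)   # corresponds to --world_size on one machine
    mp.spawn(worker, args=(world_size, gpus_id), nprocs=world_size)
```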
## How to publish weights
Upon completing the training phase, you can execute the command below to prepare the trained weights for use as prior knowledge in the backbone of a lane detection model.
```Shell
python main.py --checkpoint path/to/checkpoint --encoder resnet50
```
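The published weights can then serve as the backbone initialization of a lane detection model. A hypothetical loading example follows; the file name and key layout are assumptions, not guaranteed by `main.py`:

```python
# Load published CLLD weights into a torchvision ResNet-50 backbone.
import torch
from torchvision.models import resnet50

backbone = resnet50()
state = torch.load("path/to/published_weights.pth", map_location="cpu")
# strict=False tolerates projection/prediction heads that are not part
# of the plain ResNet backbone.
missing, unexpected = backbone.load_state_dict(state, strict=False)
print("missing:", missing)
print("unexpected:", unexpected)
```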
## Our experiments
We specifically chose to evaluate CLLD with UNet because it is a common encoder-decoder architecture used in many methods that treat lane detection as a segmentation problem. In addition, we tested our method with RESA, a state-of-the-art semantic-segmentation lane detection method that is not based on the UNet architecture; this independent validation is necessary to ensure the accuracy of our results. Lastly, we evaluated CLLD with CLRNet, a leading anchor-based lane detection method.
*Visualized results*

**Performance of UNet on CuLane and TuSimple with different contrastive learning methods.**

| Method | # Epochs | Precision (CuLane) | Recall (CuLane) | F1-measure (CuLane) | Accuracy (TuSimple) |
|---|---|---|---|---|---|
| PixPro | 100 | 73.68 | 67.15 | 70.27 | 95.92 |
| VICRegL | 300 | 67.75 | 63.43 | 65.54 | 93.58 |
| DenseCL | 200 | 63.8 | 58.4 | 60.98 | 96.13 |
| MoCo-V2 | 200 | 63.08 | 57.74 | 60.29 | 96.04 |
| CLLD (α=1) | 100 | 71.98 | 69.2 | 70.56 | 95.9 |
| CLLD (α=2) | 100 | 70.69 | 69.36 | 70.02 | 95.98 |
| CLLD (α=3) | 100 | 71.31 | 69.59 | 70.43 | 96.17 |
**Performance of RESA on CuLane and TuSimple with different contrastive learning methods.**

| Method | # Epochs | Precision (CuLane) | Recall (CuLane) | F1-measure (CuLane) | Accuracy (TuSimple) |
|---|---|---|---|---|---|
| PixPro | 100 | 77.41 | 73.69 | 75.51 | 96.6 |
| VICRegL | 300 | 76.27 | 69.58 | 72.77 | 96.18 |
| DenseCL | 200 | 77.67 | 73.51 | 75.53 | 96.28 |
| MoCo-V2 | 200 | 78.12 | 73.36 | 75.66 | 96.56 |
| CLLD (α=1) | 100 | 79.01 | 72.99 | 75.88 | 96.74 |
| CLLD (α=2) | 100 | 78 | 73.45 | 75.66 | 96.78 |
| CLLD (α=3) | 100 | 78.34 | 74.29 | 76.26 | 96.81 |
**Performance of CLRNet on CuLane and TuSimple with different contrastive learning methods.**

| Method | # Epochs | Precision (CuLane) | Recall (CuLane) | F1-measure (CuLane) | Accuracy (TuSimple) |
|---|---|---|---|---|---|
| PixPro | 100 | 89.19 | 70.39 | 78.67 | 93.88 |
| VICRegL | 300 | 87.72 | 71.15 | 78.72 | 89.01 |
| DenseCL | 200 | 88.07 | 69.67 | 77.8 | 85.15 |
| MoCo-V2 | 200 | 88.91 | 71.02 | 78.96 | 93.87 |
| CLLD (α=1) | 100 | 88.72 | 71.33 | 79.09 | 90.68 |
| CLLD (α=2) | 100 | 87.95 | 71.44 | 78.84 | 93.48 |
| CLLD (α=3) | 100 | 88.59 | 71.73 | 79.27 | 94.25 |
## Acknowledgement
* [RESA](https://arxiv.org/abs/2008.13719)
* [CLRNet](https://github.com/Turoad/CLRNet/tree/main)
* [UNet](https://arxiv.org/abs/1505.04597)