Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/manuelz/dlpt-semantic-segmentation
Project #4 for the OpenCV University course "Deep Learning with PyTorch".
deep-learning python pytorch semantic-segmentation
- Host: GitHub
- URL: https://github.com/manuelz/dlpt-semantic-segmentation
- Owner: ManuelZ
- Created: 2024-08-16T21:11:44.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2024-09-12T17:11:25.000Z (14 days ago)
- Last Synced: 2024-09-26T20:04:12.739Z (about 5 hours ago)
- Topics: deep-learning, python, pytorch, semantic-segmentation
- Language: Jupyter Notebook
- Homepage:
- Size: 202 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Semantic Segmentation: Drone aerial views
This is the fourth project of the OpenCV University course ["Deep Learning with PyTorch"](https://opencv.org/university/deep-learning-with-pytorch/).
It focuses on applying semantic segmentation to images taken from drones to differentiate between 12 classes.

## Introduction
Semantic segmentation is a computer vision task in which the objective is to assign a class label to every pixel in an
image. This project focuses on classifying the pixels of images taken from drones into 12 classes.

## Data
The project uses a dataset of 3269 images of size 1280 (W) x 720 (H), taken by drones, along with annotated image masks
for the following 12 classes: background, person, bike, car, drone, boat, animal, obstacle, construction, vegetation, road, sky.
Examples:
![img.png](media/img.png)

## The method used
Fine-tuning of a DeepLabV3 ResNet-101 pre-trained model using a custom PyTorch training loop. The objective was to learn
how to manually implement all the required steps, particularly those of the training loop.

- The dataset was split using a stratified shuffle split scheme into train and validation subsets with 80% and 20% of the
  available data, respectively. The stratification was based on the presence or absence of each class in each image.
- Various augmentation techniques were used to try to improve generalization.
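A split like this can be sketched with scikit-learn. The stratification key used here (a clipped count of the classes present in each image, computed on synthetic presence data) is a simplification; the project's exact encoding is in the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset: a 12-entry class-presence vector per
# image (1 if the class appears anywhere in that image's mask).
rng = np.random.default_rng(0)
presence = rng.integers(0, 2, size=(100, 12))  # 100 images, 12 classes

# Simplified stratification key: number of classes present in each image,
# clipped so every stratum has enough members to split.
key = np.clip(presence.sum(axis=1), 4, 8)

indices = np.arange(len(presence))
train_idx, val_idx = train_test_split(
    indices, test_size=0.2, random_state=42, shuffle=True, stratify=key
)
```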
- The loss function used was an equally weighted combination of the Focal Loss and the Soft Dice Loss:
  - The Focal Loss is a modification of the Cross-Entropy Loss, focused on learning from hard negative examples.
  - The Soft Dice Loss is effective in addressing the challenge of imbalanced foreground and background regions.
- An SGD optimizer with the setup from the YOLOv5 training script, where three parameter groups are defined with
  different weight decay configurations.
- A learning rate scheduler that implements the 1-cycle policy: it adjusts the learning rate from an initial rate to a
  maximum, then decreases it to a much lower minimum.
- The custom training loop includes:
  - updating the optimizer learning rate using an LR scheduler
  - gradient accumulation
  - evaluation on the validation set
  - tracking of training losses and scores
  - tracking of validation losses and scores
  - tracking of per-class scores

## Discussion
The model used was DeepLabV3, trained for 40 epochs, which resulted in a Dice score of `0.60363` on the test set.
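For reference, a macro-averaged Dice score over integer-labelled masks can be computed as in this NumPy sketch; the exact averaging behind the reported score is defined in the notebook.

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, num_classes: int = 12,
               eps: float = 1e-6) -> float:
    """Macro-averaged Dice score for integer-labelled masks of shape (N, H, W)."""
    scores = []
    for c in range(num_classes):
        p = (pred == c).astype(float)
        t = (target == c).astype(float)
        denom = p.sum() + t.sum()
        if denom > 0:  # ignore classes absent from both prediction and target
            scores.append((2 * (p * t).sum() + eps) / (denom + eps))
    return float(np.mean(scores))

# A perfect prediction scores 1.0.
mask = np.random.default_rng(0).integers(0, 12, size=(2, 8, 8))
print(dice_score(mask, mask))  # → 1.0
```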
Further improvements to the data splitting process could incorporate the pixel count for each class in every image,
so that the images are distributed in a way that considers the occurrence of each class, weighted by the size of
the objects.

See the [notebook](project-4-deep-learning-with-pytorch-2024.ipynb).
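The training-loop ingredients described above (three SGD parameter groups, a 1-cycle schedule, gradient accumulation) can be combined roughly as in this sketch. The model, data, and hyperparameters are toy stand-ins, and plain Cross-Entropy replaces the combined Focal/Dice loss for brevity.

```python
import torch
from torch import nn

# Toy stand-ins for the real dataset/model (assumptions, not the project's code).
model = nn.Sequential(nn.Conv2d(3, 12, kernel_size=1))  # 12-class "segmenter"
data = [(torch.randn(2, 3, 8, 8), torch.randint(0, 12, (2, 8, 8))) for _ in range(8)]

# Three parameter groups (YOLOv5-style): weight decay only on multi-dim weights.
decay, no_decay, biases = [], [], []
for name, p in model.named_parameters():
    if name.endswith("bias"):
        biases.append(p)
    elif p.ndim > 1:
        decay.append(p)
    else:
        no_decay.append(p)
optimizer = torch.optim.SGD(
    [{"params": decay, "weight_decay": 5e-4},
     {"params": no_decay, "weight_decay": 0.0},
     {"params": biases, "weight_decay": 0.0}],
    lr=0.01, momentum=0.9, nesterov=True,
)

accum_steps = 4  # gradient accumulation: effective batch = 4 micro-batches
epochs = 2
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.1,
    total_steps=epochs * (len(data) // accum_steps),  # one step per optimizer update
)

criterion = nn.CrossEntropyLoss()  # the project combines Focal and Soft Dice losses
for epoch in range(epochs):
    model.train()
    optimizer.zero_grad()
    for i, (images, masks) in enumerate(data):
        loss = criterion(model(images), masks) / accum_steps  # scale for accumulation
        loss.backward()
        if (i + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
            scheduler.step()  # 1-cycle policy: warm up to max_lr, then anneal
```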