# Deep Learning 2024 - Project Assignment

Project developed by Matteo Beltrami (@matteobeltrami), Pietro Bologna (@bolognapietro), and Christian Sassi for the Deep Learning course.

## Introduction
Deep neural networks often suffer from severe performance degradation when tested on images that differ visually from those encountered during training. This degradation is caused by factors such as domain shift, noise, or changes in lighting.
Recent research has focused on domain adaptation techniques to build deep models that can adapt from an annotated source dataset to a target dataset. However, such methods usually require access to downstream training data, which can be challenging to collect.
An alternative approach is **Test-Time Adaptation (TTA)**, which aims to improve the robustness of a pre-trained neural network on a test dataset, typically by adapting the network to one test sample at a time. Two notable TTA methods for image classification are:
- **[Marginal Entropy Minimization with One test point (MEMO)](https://arxiv.org/pdf/2110.09506)**: This method uses pre-trained models directly without making any assumptions about their specific training procedures or architectures, requiring only a single test input for adaptation.
- **[Test-Time Prompt Tuning (TPT)](https://arxiv.org/pdf/2209.07511)**: This method adapts vision-language models such as CLIP by tuning the text prompt at test time, requiring only a single unlabeled test sample and no labeled data from the target domain.

## MEMO
For this project, MEMO was applied to a pretrained Vision Transformer, **ViT-B/16**, using the **ImageNetV2** dataset. This network operates as follows: given a test point $x \in X$, it produces a conditional output distribution $p(y | x; w)$ over a set of classes $Y$, and predicts a label $\hat{y}$ as:

$$ \hat{y} = M(x | w) = \arg \max_{y \in Y} p(y | x; w) $$
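As a minimal sketch of this prediction rule (assuming PyTorch and torchvision; the exact model variant and preprocessing used in the notebook may differ, and `predict` is a hypothetical helper):

```python
import torch
from torchvision import models

# Pretrained ViT-B/16 and its matching preprocessing pipeline.
weights = models.ViT_B_16_Weights.IMAGENET1K_V1
model = models.vit_b_16(weights=weights).eval()
preprocess = weights.transforms()

@torch.no_grad()
def predict(image):
    """Return y_hat = argmax_y p(y | x; w) for a single PIL image x."""
    x = preprocess(image).unsqueeze(0)    # (1, 3, 224, 224)
    probs = model(x).softmax(dim=-1)      # p(y | x; w) over the ImageNet classes
    return probs.argmax(dim=-1).item()
```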
*Fig. 1: MEMO overview.*

Let $A = \{a_1, \dots, a_M\}$ be a set of augmentations (resizing, cropping, color jittering, etc.). Each augmentation $a_i \in A$ can be applied to an input sample $x$, resulting in a transformed sample denoted $a_i(x)$, as shown in Fig. 1. The objective is to make the model's prediction invariant to these specific transformations.
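The exact augmentation set is a design choice (the original MEMO paper uses AugMix). As an illustrative sketch with standard torchvision transforms, a pool $A$ and a sampler for it might look like this (`A` and `augment` are hypothetical names, not necessarily those used in the notebook):

```python
import random
from torchvision import transforms

# Illustrative pool A = {a_1, ..., a_M}; the notebook's actual augmentations may differ.
A = [
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.RandomHorizontalFlip(p=1.0),
    transforms.RandomRotation(degrees=15),
]

def augment(image, B=8):
    """Apply B augmentations sampled with replacement from A to a PIL image x."""
    return [random.choice(A)(image) for _ in range(B)]
```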
MEMO starts by applying a set of $B$ augmentation functions sampled from $A$ to $x$. It then calculates the average, or marginal, output distribution $ \bar{p}(y | x; w) $ by averaging the conditional output distributions over these augmentations, represented as:
$$ \bar{p}(y | x; w) = \frac{1}{B} \sum_{i=1}^B p(y | a_i(x); w) $$
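In code, this averaging step could be sketched as follows, reusing the hypothetical `model`, `preprocess`, and `augment` helpers from the snippets above:

```python
import torch

def marginal_distribution(model, preprocess, views):
    """Compute p_bar(y | x; w) as the mean softmax output over B augmented views."""
    batch = torch.stack([preprocess(v) for v in views])  # (B, 3, 224, 224)
    probs = model(batch).softmax(dim=-1)                 # p(y | a_i(x); w), one row per view
    return probs.mean(dim=0)                             # p_bar(y | x; w), shape (num_classes,)

# Usage: p_bar = marginal_distribution(model, preprocess, augment(image, B=8))
```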
Since the true label $y$ is not available during testing, the objective of Test-Time Adaptation (TTA) is twofold: (i) to ensure that the model predicts the same label $y$ across the various augmented versions of the test sample, and (ii) to increase the confidence of the model's predictions, given that the augmented versions share the same label. To this end, the model is trained to minimize the entropy of the marginal output distribution across augmentations, defined as:
$$ L(w; x) = H(\bar{p}(\cdot | x; w)) = -\sum_{y \in Y} \bar{p}(y | x; w) \log \bar{p}(y | x; w) $$
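Putting the pieces together, a single MEMO adaptation step (one gradient step on the marginal entropy, followed by a prediction on the original test point) might be sketched as below; the optimizer, learning rate, and `B` are illustrative assumptions, not necessarily the notebook's settings:

```python
import copy
import torch

def memo_adapt_and_predict(model, preprocess, image, augment, B=8, lr=1e-4):
    """Adapt a copy of the model on one test point by minimizing H(p_bar), then predict."""
    adapted = copy.deepcopy(model)  # episodic adaptation: the original weights stay intact
    optimizer = torch.optim.SGD(adapted.parameters(), lr=lr)

    # Marginal output distribution over B augmented views of x.
    views = torch.stack([preprocess(v) for v in augment(image, B)])
    p_bar = adapted(views).softmax(dim=-1).mean(dim=0)

    # Marginal entropy L(w; x) = -sum_y p_bar(y) log p_bar(y).
    loss = -(p_bar * p_bar.clamp_min(1e-12).log()).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Predict on the un-augmented test point with the adapted weights.
    with torch.no_grad():
        x = preprocess(image).unsqueeze(0)
        return adapted(x).softmax(dim=-1).argmax(dim=-1).item()
```

Copying the model before each update keeps the adaptation episodic: every test point is handled from the same pre-trained weights, so errors on one sample do not accumulate into the next.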
## How to Run
1. Clone the repository:
```bash
git clone https://github.com/christiansassi/deep-learning-project
cd deep-learning-project
```
2. Upload the notebook `deep_learning.ipynb` to [Google Colab](https://colab.research.google.com/). *NOTE: Make sure you use the T4 GPU runtime.*

# Contacts
Matteo Beltrami - [[email protected]](mailto:[email protected])
Pietro Bologna - [[email protected]](mailto:[email protected])
Christian Sassi - [[email protected]](mailto:[email protected])