https://github.com/kimrass/vit

PyTorch implementation of 'ViT' (Dosovitskiy et al., 2020) and training it on CIFAR-10 and CIFAR-100
cifar10 cifar100 cutmix dropblock hide-and-seek imagenet1k mixup vision-transformer

Host: GitHub
URL: https://github.com/kimrass/vit
Owner: KimRass
Created: 2023-08-15T11:41:59.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-05-02T08:54:44.000Z (over 1 year ago)
Last Synced: 2024-05-02T22:03:57.094Z (over 1 year ago)
Topics: cifar10, cifar100, cutmix, dropblock, hide-and-seek, imagenet1k, mixup, vision-transformer
Language: Python
Homepage:
Size: 39.9 MB
Stars: 2
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# 1. Pre-trained Models

```python
# Model
DROP_PROB = 0.1
N_LAYERS = 6
HIDDEN_SIZE = 384
MLP_SIZE = 384
N_HEADS = 12
PATCH_SIZE = 4

# Optimizer and learning-rate schedule
BASE_LR = 1e-3
BETA1 = 0.9
BETA2 = 0.999
WEIGHT_DECAY = 5e-5
WARMUP_EPOCHS = 5

# Regularization and augmentation
SMOOTHING = 0.1
CUTMIX = False
CUTOUT = False
HIDE_AND_SEEK = False

# Training
BATCH_SIZE = 2048
N_EPOCHS = 300
```
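
The configuration lists `BASE_LR` and `WARMUP_EPOCHS` but does not say how the learning rate evolves. As a minimal sketch, assuming a linear warmup followed by cosine decay (both shapes are assumptions, and `lr_at_epoch` is a hypothetical helper, not a function from this repository), the per-epoch learning rate could look like this:

```python
import math

BASE_LR = 1e-3
WARMUP_EPOCHS = 5
N_EPOCHS = 300


def lr_at_epoch(epoch, base_lr=BASE_LR, warmup_epochs=WARMUP_EPOCHS, n_epochs=N_EPOCHS):
    """Learning rate for a 0-indexed epoch.

    Assumption: linear warmup up to base_lr, then cosine decay towards zero.
    The README only specifies BASE_LR and WARMUP_EPOCHS.
    """
    if epoch < warmup_epochs:
        # Ramp linearly so that the last warmup epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, n_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 4, 5, 150, 299):
    print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.6f}")
```

With a batch size as large as 2048, a warmup phase of this kind is a common way to avoid unstable early updates, though the exact schedule used for the checkpoints below may differ.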

## 1) Trained on CIFAR-10 Dataset for 300 Epochs

- [vit_cifar10.pth](https://drive.google.com/file/d/1NkMB-WIDIwLIs-DvIxI39-K4TgQFq-nL/view?usp=sharing)

- Top-1 accuracy 0.864 on validation set

## 2) Trained on CIFAR-100 Dataset for 256 Epochs

- [vit_cifar100.pth](https://drive.google.com/file/d/1vxH9c1q2BbHiFRN8JSlu3zj7ZBPvQYR8/view?usp=sharing)

- Top-1 accuracy 0.447 on validation set

# 2. Implementation Details

- Changed the architecture so that the order is `F.gelu()` → `nn.Dropout()`. With the opposite order, we observed that the gradients became 0 and training did not progress.

- On CIFAR-100 with `N_LAYERS = 6, HIDDEN_SIZE = 384, N_HEADS = 6`, performance was higher with `PATCH_SIZE = 8`, and with `PATCH_SIZE = 4`, than with `PATCH_SIZE = 16`.

- On both CIFAR-10 and CIFAR-100, models smaller than ViT-Base achieved higher performance.
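
The activation-then-dropout order from the first bullet can be illustrated without PyTorch. The sketch below is a plain-Python stand-in, not the repository's code: `gelu` mirrors the default (exact, erf-based) form of `F.gelu()`, `dropout` mirrors the inverted-dropout behavior of `nn.Dropout()`, and `gelu_then_dropout` is a hypothetical helper name.

```python
import math
import random


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def dropout(xs, p, rng, training=True):
    """Inverted dropout: zero each value with probability p, scale the rest by 1 / (1 - p)."""
    if not training or p == 0.0:
        return list(xs)
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in xs]


def gelu_then_dropout(xs, p, rng, training=True):
    """Apply the activation first and dropout second, as in the first bullet above."""
    return dropout([gelu(x) for x in xs], p, rng, training)


rng = random.Random(0)
hidden = [-2.0, -0.5, 0.0, 0.5, 2.0]
print(gelu_then_dropout(hidden, p=0.1, rng=rng))                  # training mode
print(gelu_then_dropout(hidden, p=0.1, rng=rng, training=False))  # evaluation mode
```

In evaluation mode dropout is the identity, so the second call returns the plain GELU activations.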

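To make the patch-size comparison concrete, the arithmetic below shows how many tokens each `PATCH_SIZE` produces. It assumes that the 32×32 CIFAR images are fed at their native resolution and that a single class token is prepended; neither is stated in this README, and `patch_stats` is a hypothetical helper.

```python
IMAGE_SIZE = 32   # CIFAR-10 and CIFAR-100 images are 32x32 RGB
N_CHANNELS = 3


def patch_stats(image_size, patch_size, n_channels=N_CHANNELS):
    """Number of patches, sequence length, and flattened patch dimension."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    n_patches = (image_size // patch_size) ** 2
    return {
        "n_patches": n_patches,
        "seq_len": n_patches + 1,  # assumption: one class token is prepended
        "patch_dim": n_channels * patch_size ** 2,
    }


for patch_size in (16, 8, 4):
    print(patch_size, patch_stats(IMAGE_SIZE, patch_size))
# 16 -> 4 patches, 8 -> 16 patches, 4 -> 64 patches
```

Under these assumptions, `PATCH_SIZE = 16` leaves only 4 patches per image, while `PATCH_SIZE = 4` yields 64.
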
# 3. Studies

## 1) Attention Map

- Original image

    - [Figure: original image]

- head_fusion: "max", discard_ratio: 0.85

    - [Figure: attention map]
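
The parameter names `head_fusion` and `discard_ratio` resemble those used in common attention rollout implementations. The README does not name the method, so the following is only a sketch under that assumption: heads are fused per layer, the lowest-valued entries are discarded, the identity is added for the residual connection, and the layers are multiplied together. The random matrices are stand-ins for real attention weights, and all function names are hypothetical.

```python
import random


def fuse_heads(attn, mode="max"):
    """Collapse per-head attention [heads][n][n] into a single [n][n] matrix."""
    reducers = {"max": max, "min": min, "mean": lambda xs: sum(xs) / len(xs)}
    reducer = reducers[mode]
    n = len(attn[0])
    return [[reducer([head[i][j] for head in attn]) for j in range(n)] for i in range(n)]


def discard_lowest(matrix, discard_ratio):
    """Zero out the lowest `discard_ratio` fraction of entries."""
    n = len(matrix)
    order = sorted((matrix[i][j], i, j) for i in range(n) for j in range(n))
    out = [row[:] for row in matrix]
    for _, i, j in order[: int(n * n * discard_ratio)]:
        out[i][j] = 0.0
    return out


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def attention_rollout(attentions, head_fusion="max", discard_ratio=0.85):
    """attentions: list over layers of [heads][n][n]; returns class-token relevance per patch."""
    n = len(attentions[0][0])
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in attentions:
        fused = discard_lowest(fuse_heads(attn, head_fusion), discard_ratio)
        # Add the identity to account for the residual connection, then renormalize rows.
        a = [[(fused[i][j] + (1.0 if i == j else 0.0)) / 2.0 for j in range(n)] for i in range(n)]
        a = [[v / sum(row) for v in row] for row in a]
        result = matmul(a, result)
    return result[0][1:]  # class-token row, without its own column


def random_attention(n_heads, n, rng):
    """Random row-stochastic matrices standing in for real attention weights."""
    heads = []
    for _ in range(n_heads):
        rows = []
        for _ in range(n):
            raw = [rng.random() for _ in range(n)]
            rows.append([v / sum(raw) for v in raw])
        heads.append(rows)
    return heads


rng = random.Random(0)
N_TOKENS = 5  # illustrative: 1 class token + 4 patches
attentions = [random_attention(3, N_TOKENS, rng) for _ in range(2)]
mask = attention_rollout(attentions)
print(mask)
```

Reshaping `mask` to the patch grid and upsampling it to the image size would give a heat map comparable to the attention map figure.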

## 2) Position Embedding Similarity

- [Figure: position embedding similarity]
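
The ViT paper visualizes position embeddings through the cosine similarity between one patch's embedding and those of all other patches. Assuming the figure here follows that convention, the computation can be sketched as below. The random vectors are stand-ins for learned embeddings, the class token's embedding is assumed to be excluded, and `similarity_grid` is a hypothetical helper.

```python
import math
import random


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_grid(pos_embed, index, grid_size):
    """Cosine similarity between patch `index` and every patch, laid out on the patch grid."""
    sims = [cosine_similarity(pos_embed[index], e) for e in pos_embed]
    return [sims[r * grid_size:(r + 1) * grid_size] for r in range(grid_size)]


rng = random.Random(0)
GRID_SIZE = 8      # 32 // PATCH_SIZE with PATCH_SIZE = 4
HIDDEN_SIZE = 384
# Stand-in for learned position embeddings: one vector per patch.
pos_embed = [[rng.gauss(0.0, 1.0) for _ in range(HIDDEN_SIZE)] for _ in range(GRID_SIZE ** 2)]

grid = similarity_grid(pos_embed, index=0, grid_size=GRID_SIZE)
for row in grid:
    print(" ".join(f"{v:+.2f}" for v in row))
```

Repeating this for every patch index and tiling the resulting grids gives the full similarity figure.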
