https://github.com/vfdev-5/cifar10-faster

[WIP] Demonstration of training a small ResNet on CIFAR10 to 94% test accuracy in less 20 epochs
https://github.com/vfdev-5/cifar10-faster

Last synced: about 1 year ago
JSON representation

[WIP] Demonstration of training a small ResNet on CIFAR10 to 94% test accuracy in less 20 epochs

Host: GitHub
URL: https://github.com/vfdev-5/cifar10-faster
Owner: vfdev-5
Created: 2019-02-14T15:15:51.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-02-20T16:20:53.000Z (over 7 years ago)
Last Synced: 2025-03-31T12:05:02.357Z (about 1 year ago)
Language: Python
Size: 44.9 KB
Stars: 9
Watchers: 3
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Faster training on CIFAR10

Demonstration of training a small ResNet on CIFAR10 to 94% test accuracy in the minimal possible time and under 20 epochs.

This code is inspired by [cifar10-fast repository](https://github.com/davidcpage/cifar10-fast) and some of
the current code is adapted from the repository. In his [blog articles](https://www.myrtle.ai/2018/09/24/how_to_train_your_resnet/)
David Page (@davidcpage) explains the choice of the model and the way to optimize the dataflow.

## Project on Polyaxon platform

### Create project
```
polyaxon project create --name=cifar10-faster --description="Faster training on CIFAR10"
```

### Initialize project
```
polyaxon init cifar10-faster
```

## Reproduce original result

Train `fast-resnet` during 24 epochs using cutout data augmentation, SGD optimizer, piecewise linear scheduling:

```
polyaxon run -u -f plx_configs/fastresnet/xp_original_training.yaml --name=xp_original_training --tags=original
```

## Check other configurations around `fast-resnet`

### [Mixup](https://arxiv.org/abs/1710.09412)

We remove cutout data augmentation and uses mixup technics:
```
polyaxon run -u -f plx_configs/fastresnet/xp_training_mixup.yaml --name=xp_training_mixup --tags=original,mixup
```

### [AdamW](https://arxiv.org/pdf/1711.05101.pdf)

We uses decoupled weight decay Adam optimizer instead of SGD
```
polyaxon run -u -f plx_configs/fastresnet/xp_training_adamw.yaml --name=xp_training_adamw--tags=original,adamw
```

## Run hyperparameter tuning

```
polyaxon run -u -f plx_configs/fastresnet/gp_hp_bo_training.yaml --name=gp_hp_bo_training --tags=lt_20
```
or on WRN model
```
polyaxon run -u -f plx_configs/wrn/gp_hp_bo_training_wrn.yaml --name=gp_hp_bo_training_wrn --tags=lt_20,wrn
```

## Experiments on Google Colab

https://colab.research.google.com/drive/1W1_WEtatzyn32aPSrp4t5n66PuHQW6W8

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/vfdev-5/cifar10-faster

Awesome Lists containing this project

README