https://github.com/lonepatient/novograd-pytorch
PyTorch implementation of the NovoGrad optimizer
- Host: GitHub
- URL: https://github.com/lonepatient/novograd-pytorch
- Owner: lonePatient
- License: mit
- Created: 2019-08-25T16:01:47.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2022-03-08T10:35:44.000Z (over 3 years ago)
- Last Synced: 2025-03-24T08:21:22.875Z (6 months ago)
- Topics: adam, adamw, alexnet, cifar10, novograd, optimizer, pytorch
- Language: Python
- Size: 215 KB
- Stars: 18
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
## NovoGrad PyTorch
This repository contains a PyTorch implementation of the NovoGrad Optimizer from the paper
[Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks](https://arxiv.org/abs/1905.11286)
by Boris Ginsburg, Patrice Castonguay, et al.
## Summary
NovoGrad is a first-order SGD method with gradients normalized per layer. Borrowing from ND-Adam, NovoGrad uses the second moment for normalization and decouples weight decay from the stochastic gradient for regularization, as in AdamW. NovoGrad has half the memory consumption of Adam (similar to AdaFactor, but with a simpler moment computation). Unlike AdaFactor, NovoGrad does not require learning rate warmup.
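
Concretely, the per-layer update can be sketched as follows. This is a minimal illustration following the paper's description, not the repository's `optimizer.py`; the function name, `eps` term, and initialization details here are assumptions.

```python
import torch

def novograd_step(param, grad, state, lr=0.01, betas=(0.95, 0.98),
                  weight_decay=0.001, eps=1e-8):
    """One NovoGrad-style update for a single layer (illustrative sketch).

    `state` holds the per-layer second moment `v` (a scalar, which is why the
    memory footprint is roughly half of Adam's) and the first moment `m`
    (same shape as the parameter).
    """
    beta1, beta2 = betas
    g_norm_sq = grad.pow(2).sum()  # squared L2 norm of the layer gradient

    if 'v' not in state:
        # first step: initialize moments from the current gradient
        state['v'] = g_norm_sq
        state['m'] = grad / (g_norm_sq.sqrt() + eps) + weight_decay * param
    else:
        # second moment is a single scalar per layer, not per element as in Adam
        state['v'] = beta2 * state['v'] + (1 - beta2) * g_norm_sq
        # gradient normalized by the layer norm, with decoupled weight decay
        normalized = grad / (state['v'].sqrt() + eps) + weight_decay * param
        state['m'] = beta1 * state['m'] + normalized

    # SGD-with-momentum style parameter update
    param.data.add_(state['m'], alpha=-lr)

# purely illustrative usage on a single tensor
w = torch.randn(10, 5, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()
state = {}
novograd_step(w, w.grad, state)
```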

## Dependencies
* PyTorch
* torchvision
* matplotlib

## Usage
The code in this repository implements both NovoGrad and Adam training, with examples on the CIFAR-10 dataset.
Add the `optimizer.py` script to your project, and import it.
To use NovoGrad, construct it like a standard PyTorch optimizer:
```python
from optimizer import NovoGrad
optimizer = NovoGrad(model.parameters(), lr=0.01, betas=(0.95, 0.98), weight_decay=0.001)
```
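
Once constructed, the optimizer is driven like any other PyTorch optimizer inside the training loop. A standard usage sketch (the `model`, `loss_fn`, and `loader` objects are placeholders, not names from this repository):

```python
for inputs, targets in loader:
    optimizer.zero_grad()                   # clear gradients from the previous step
    loss = loss_fn(model(inputs), targets)
    loss.backward()                         # compute gradients
    optimizer.step()                        # NovoGrad parameter update
```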
## Example
To produce the results below, we train AlexNet on the CIFAR-10 dataset.
```bash
# use adam
python run.py --optimizer=adam --model=alexnet
# use novograd
python run.py --optimizer=novograd --model=alexnet
# use adamw
python run.py --optimizer=adamw --model=alexnet
# use lr scheduler
python run.py --optimizer=adam --model=alexnet --do_scheduler
python run.py --optimizer=novograd --model=alexnet --do_scheduler
python run.py --optimizer=adamw --model=alexnet --do_scheduler
```
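
The `--do_scheduler` flag enables a learning-rate schedule. The README does not say which schedule the repository uses, but a hypothetical pairing of NovoGrad with a stock PyTorch scheduler could look like this (cosine annealing chosen only for illustration; `model`, `loader`, `num_epochs`, and `train_one_epoch` are placeholders):

```python
from torch.optim.lr_scheduler import CosineAnnealingLR
from optimizer import NovoGrad

optimizer = NovoGrad(model.parameters(), lr=0.01, betas=(0.95, 0.98), weight_decay=0.001)
scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs)  # decay lr over num_epochs

for epoch in range(num_epochs):
    train_one_epoch(model, loader, optimizer)  # placeholder training function
    scheduler.step()                           # advance the learning-rate schedule
```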
## AlexNet Results
Train loss of Adam, AdamW, and NovoGrad with AlexNet on CIFAR-10.

Validation loss of Adam, AdamW, and NovoGrad with AlexNet on CIFAR-10.

Validation accuracy of Adam, AdamW, and NovoGrad with AlexNet on CIFAR-10.
