https://github.com/shaoanlu/adamw-and-sgdw
Keras implementation of AdamW from "Fixing Weight Decay Regularization in Adam" (https://arxiv.org/abs/1711.05101)
- Host: GitHub
- URL: https://github.com/shaoanlu/adamw-and-sgdw
- Owner: shaoanlu
- Created: 2017-12-05T14:32:14.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-07-23T12:06:41.000Z (about 7 years ago)
- Last Synced: 2025-05-12T16:44:25.897Z (5 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 328 KB
- Stars: 71
- Watchers: 6
- Forks: 9
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# AdamW-and-SGDW
[Fixing Weight Decay Regularization in Adam](https://arxiv.org/abs/1711.05101)
Ilya Loshchilov, Frank Hutter

### [WIP Alert]
This repository is still a work in progress.
The functionality of AdamW and SGDW has not been fully verified. The implementation could be wrong.

## Usage
Please have a look at [demo_fashion_mnist.ipynb](https://github.com/shaoanlu/AdamW-and-SGDW/blob/master/demo_fashion_mnist.ipynb).
```python
from AdamW import AdamW
from SGDW import SGDW

# Suggested weight decay factor from the paper: w = w_norm * (b/B/T)**0.5
# b: batch size
# B: total number of training points per epoch
# T: total number of epochs
# w_norm: designed weight decay factor (w is the normalized one).

# weight_decay: float >= 0. The parameter for decoupled weight decay.
AdamW(lr=0.001, beta_1=0.9, beta_2=0.999, weight_decay=1e-4, epsilon=1e-8, decay=0.)
SGDW(lr=0.01, momentum=0., decay=0., weight_decay=1e-4, nesterov=False)
```
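Putting the pieces together, here is a minimal sketch of how one of these optimizers might be wired into a Keras model, using the constructor signatures shown above. The network architecture, dataset sizes, and `w_norm` value are illustrative assumptions, not taken from the repository; see the demo notebook for the authors' actual setup.

```python
from keras.models import Sequential
from keras.layers import Dense
from AdamW import AdamW

# Compute the weight decay factor following the paper's normalization:
# w = w_norm * (b / (B * T)) ** 0.5
b = 128         # batch size (assumed)
B = 60000       # training points per epoch (e.g. Fashion-MNIST)
T = 25          # total number of epochs (assumed)
w_norm = 0.025  # designed (normalized) weight decay factor (assumed value)
w = w_norm * (b / B / T) ** 0.5

# A small illustrative classifier for flattened 28x28 images.
model = Sequential([
    Dense(256, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax'),
])

# Pass the AdamW optimizer with the normalized weight decay to compile().
model.compile(optimizer=AdamW(lr=0.001, weight_decay=w),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```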