Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
AdamW optimizer for Keras
https://github.com/glambard/adamw_keras
- Host: GitHub
- URL: https://github.com/glambard/adamw_keras
- Owner: GLambard
- Created: 2018-07-03T08:52:21.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-08-09T01:51:41.000Z (over 5 years ago)
- Last Synced: 2023-10-20T08:47:46.861Z (over 1 year ago)
- Topics: adam, adamw, keras, optimizer, tensorflow
- Language: Python
- Size: 6.84 KB
- Stars: 113
- Watchers: 3
- Forks: 33
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# [Fixing Weight Decay Regularization in Adam](https://arxiv.org/abs/1711.05101) - For [Keras](https://keras.io/) :zap: :smiley:
Implementation of the [**AdamW optimizer**](https://arxiv.org/abs/1711.05101) (**Ilya Loshchilov, Frank Hutter**) for [Keras](https://keras.io/).
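The core change relative to plain Adam is that the weight decay is decoupled from the gradient-based update. Schematically (following Algorithm 2 of the paper, where $\hat{m}_t$ and $\hat{v}_t$ are the usual bias-corrected Adam moment estimates, $\eta$ the learning rate and $w$ the weight decay):

$$
\theta_t \leftarrow \theta_{t-1} - \eta \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} + w\,\theta_{t-1} \right)
$$

so the decay multiplies the weights directly rather than being folded into the gradient as an L2 penalty.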
## Tested on this system
- python 3.6
- Keras 2.1.6
- tensorflow(-gpu) 1.8.0

## Usage
In addition to the usual Keras setup for building neural nets (see [Keras](https://keras.io/) for details), import and instantiate the optimizer:
```
from AdamW import AdamW

adamw = AdamW(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0., weight_decay=0.025, batch_size=1, samples_per_epoch=1, epochs=1)
```
Then nothing changes compared to the usual use of an optimizer in Keras after defining a model's architecture:
```
from keras import metrics
from keras.models import Sequential

model = Sequential()
model.compile(loss="mse", optimizer=adamw, metrics=[metrics.mse], ...)
```
Note that the batch size (`batch_size`), the number of training samples per epoch (`samples_per_epoch`) and the number of epochs (`epochs`) are necessary for the normalization of the weight decay ([paper](https://arxiv.org/abs/1711.05101), Section 4).
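Putting the pieces together, a minimal end-to-end sketch (the data, architecture and hyperparameter values here are illustrative, not from this repository; only `AdamW` is):

```
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from AdamW import AdamW

# Toy regression data (illustrative shapes only)
x = np.random.rand(1000, 20)
y = np.random.rand(1000, 1)

batch_size, epochs = 32, 10

# batch_size, samples_per_epoch and epochs must match the fit() call below;
# per Section 4 of the paper, the effective decay is presumably scaled by
# sqrt(batch_size / (samples_per_epoch * epochs)).
adamw = AdamW(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.,
              weight_decay=0.025, batch_size=batch_size,
              samples_per_epoch=x.shape[0], epochs=epochs)

model = Sequential()
model.add(Dense(64, activation="relu", input_dim=20))
model.add(Dense(1))
model.compile(loss="mse", optimizer=adamw)
model.fit(x, y, batch_size=batch_size, epochs=epochs)
```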
## Done
- Weight decay added to the parameter updates
- Normalized weight decay added

## To be done (eventually - help is welcome)
- Cosine annealing
- Warm restarts
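Until these are built into the optimizer itself, cosine annealing with warm restarts (the SGDR schedule the paper pairs with AdamW) can be approximated from the outside with Keras's standard `LearningRateScheduler` callback. A minimal sketch; the function name and cycle settings are illustrative:

```
import math
from keras.callbacks import LearningRateScheduler

def sgdr_schedule(lr_max=0.001, lr_min=1e-6, t0=10, t_mult=2):
    """Cosine annealing with warm restarts (SGDR), evaluated per epoch."""
    def schedule(epoch):
        t_i, t_cur = t0, epoch
        while t_cur >= t_i:   # step through completed restart cycles
            t_cur -= t_i
            t_i *= t_mult     # each cycle is t_mult times longer
        return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / t_i))
    return schedule

# model.fit(..., callbacks=[LearningRateScheduler(sgdr_schedule())])
```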
# Source

- [**ADAM: A Method for Stochastic Optimization**](https://arxiv.org/pdf/1412.6980v8.pdf), D. P. Kingma, J. Lei Ba
- [**Fixing Weight Decay Regularization in Adam**](https://arxiv.org/pdf/1711.05101.pdf), I. Loshchilov, F. Hutter