Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
AdamW optimizer for Keras
https://github.com/glambard/adamw_keras
- Host: GitHub
- URL: https://github.com/glambard/adamw_keras
- Owner: GLambard
- Created: 2018-07-03T08:52:21.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-08-09T01:51:41.000Z (over 5 years ago)
- Last Synced: 2023-10-20T08:47:46.861Z (over 1 year ago)
- Topics: adam, adamw, keras, optimizer, tensorflow
- Language: Python
- Size: 6.84 KB
- Stars: 113
- Watchers: 3
- Forks: 33
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# [Fixing Weight Decay Regularization in Adam](https://arxiv.org/abs/1711.05101) - For [Keras](https://keras.io/) :zap: :smiley:
Implementation of the [**AdamW optimizer**](https://arxiv.org/abs/1711.05101) (**Ilya Loshchilov, Frank Hutter**) for [Keras](https://keras.io/).
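The core change relative to plain Adam is that the weight decay is decoupled from the gradient-based update. Schematically (following Algorithm 2 of the paper, where $\hat{m}_t$ and $\hat{v}_t$ are the usual bias-corrected Adam moment estimates, $\eta$ the learning rate and $w$ the weight decay):

$$
\theta_t \leftarrow \theta_{t-1} - \eta \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} + w\,\theta_{t-1} \right)
$$

so the decay multiplies the weights directly rather than being folded into the gradient as an L2 penalty.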
## Tested on this system
- python 3.6
- Keras 2.1.6
- tensorflow(-gpu) 1.8.0

## Usage
In addition to the usual Keras setup for building neural nets (see [Keras](https://keras.io/) for details), import and instantiate the optimizer:
```
from AdamW import AdamW

adamw = AdamW(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0., weight_decay=0.025, batch_size=1, samples_per_epoch=1, epochs=1)
```
Then nothing changes compared to the usual use of an optimizer in Keras after defining a model's architecture:
```
from keras import metrics
from keras.models import Sequential

model = Sequential()
model.compile(loss="mse", optimizer=adamw, metrics=[metrics.mse], ...)
```
Note that the batch size (`batch_size`), the number of training samples per epoch (`samples_per_epoch`) and the number of epochs (`epochs`) are necessary for the normalization of the weight decay ([paper](https://arxiv.org/abs/1711.05101), Section 4).
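Putting the pieces together, a minimal end-to-end sketch (the data, architecture and hyperparameter values here are illustrative, not from this repository; only `AdamW` is):

```
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from AdamW import AdamW

# Toy regression data (illustrative shapes only)
x = np.random.rand(1000, 20)
y = np.random.rand(1000, 1)

batch_size, epochs = 32, 10

# batch_size, samples_per_epoch and epochs must match the fit() call below;
# per Section 4 of the paper, the effective decay is presumably scaled by
# sqrt(batch_size / (samples_per_epoch * epochs)).
adamw = AdamW(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.,
              weight_decay=0.025, batch_size=batch_size,
              samples_per_epoch=x.shape[0], epochs=epochs)

model = Sequential()
model.add(Dense(64, activation="relu", input_dim=20))
model.add(Dense(1))
model.compile(loss="mse", optimizer=adamw)
model.fit(x, y, batch_size=batch_size, epochs=epochs)
```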
## Done
- Weight decay added to the parameter updates
- Normalized weight decay added

## To be done (eventually - help is welcome)
- Cosine annealing
- Warm restarts
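Until these are built into the optimizer itself, cosine annealing with warm restarts (the SGDR schedule the paper pairs with AdamW) can be approximated from the outside with Keras's standard `LearningRateScheduler` callback. A minimal sketch; the function name and cycle settings are illustrative:

```
import math
from keras.callbacks import LearningRateScheduler

def sgdr_schedule(lr_max=0.001, lr_min=1e-6, t0=10, t_mult=2):
    """Cosine annealing with warm restarts (SGDR), evaluated per epoch."""
    def schedule(epoch):
        t_i, t_cur = t0, epoch
        while t_cur >= t_i:   # step through completed restart cycles
            t_cur -= t_i
            t_i *= t_mult     # each cycle is t_mult times longer
        return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / t_i))
    return schedule

# model.fit(..., callbacks=[LearningRateScheduler(sgdr_schedule())])
```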
# Source

- [**ADAM: A Method for Stochastic Optimization**](https://arxiv.org/pdf/1412.6980v8.pdf), D. P. Kingma, J. Lei Ba
- [**Fixing Weight Decay Regularization in Adam**](https://arxiv.org/pdf/1711.05101.pdf), I. Loshchilov, F. Hutter