https://github.com/titu1994/keras-padam
Keras implementation of Padam from "Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks"
- Host: GitHub
- URL: https://github.com/titu1994/keras-padam
- Owner: titu1994
- License: MIT
- Created: 2018-09-04T16:41:19.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-09-06T23:44:46.000Z (about 7 years ago)
- Last Synced: 2025-03-25T05:34:09.835Z (7 months ago)
- Language: Python
- Size: 4.84 MB
- Stars: 17
- Watchers: 4
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Partially adaptive momentum estimation method for Keras
Keras implementation of Padam from [Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks](https://arxiv.org/abs/1806.06763). Padam allows much larger learning rates to be used while closely matching the generalization of Stochastic Gradient Descent.
# Usage
Add the `padam.py` script to your project and import `Padam`. In addition to the parameters inherited from Adam, Padam takes one extra parameter, `partial`, which should lie in the range [0, 0.5].

```python
from padam import Padam

optimizer = Padam(lr=0.1, partial=0.125)
```
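To see what `partial` controls, here is a minimal NumPy sketch of one partially adaptive update step, following the rule described in the paper. The function name `padam_step` and the epsilon placement are illustrative assumptions, not this repository's implementation; with `partial=0.5` the denominator matches Adam's `sqrt(v_hat)`, while values near 0 push the update toward SGD with momentum.

```python
import numpy as np

def padam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999,
               eps=1e-8, partial=0.125):
    """One Padam update (illustrative sketch, not the repo's code).

    partial=0.5 recovers Adam's sqrt(v_hat) denominator;
    partial -> 0 approaches SGD with momentum.
    """
    m = beta1 * m + (1 - beta1) * grad        # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    # Partially adaptive step: exponent `partial` instead of Adam's 0.5
    theta = theta - lr * m_hat / (v_hat ** partial + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2, gradient is 2 * theta
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 101):
    theta, m, v = padam_step(theta, 2.0 * theta, m, v, t)
```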
# Requirements
- Keras 2.2.0+
- Tensorflow / Theano / CNTK (Tensorflow tested)