https://github.com/titu1994/keras-lamb-optimizer

Implementation of the LAMB optimizer for Keras from the paper "Reducing BERT Pre-Training Time from 3 Days to 76 Minutes"

Host: GitHub
URL: https://github.com/titu1994/keras-lamb-optimizer
Owner: titu1994
License: mit
Created: 2019-04-04T04:06:28.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2019-04-05T02:35:49.000Z (over 6 years ago)
Last Synced: 2025-04-04T04:41:12.070Z (6 months ago)
Language: Python
Size: 13.7 KB
Stars: 75
Watchers: 4
Forks: 6
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# Keras LAMB Optimizer (Layer-wise Adaptive Moments optimizer for Batch training)

-----

Implementation of the LAMB optimizer from the paper [Reducing BERT Pre-Training Time from 3 Days to 76 Minutes](https://arxiv.org/abs/1904.00962).

Supports large-batch training with batch sizes of up to 64k while tuning only the learning rate as a hyperparameter. Smaller batch sizes are also supported without changing any other hyperparameters.
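For intuition about what "layer-wise adaptive" means, the following is a minimal pure-Python sketch of one LAMB-style update for a single layer. It is not the code in this repository: the function name `lamb_step`, its default values, the bias-correction step, and the zero-norm fallback are all assumptions made for illustration, and real implementations differ in these details.

```python
import math


def lamb_step(w, g, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999,
              epsilon=1e-6, weight_decay=0.01):
    """One LAMB-style update for a single layer, with weights as a flat list.

    Hypothetical helper for illustration only; not part of keras_lamb.
    """
    # Adam-style running estimates of the gradient's first and second moments.
    m = [beta_1 * mi + (1.0 - beta_1) * gi for mi, gi in zip(m, g)]
    v = [beta_2 * vi + (1.0 - beta_2) * gi * gi for vi, gi in zip(v, g)]

    # Bias correction for step t >= 1 (assumed here; some variants omit it).
    m_hat = [mi / (1.0 - beta_1 ** t) for mi in m]
    v_hat = [vi / (1.0 - beta_2 ** t) for vi in v]

    # Adam direction plus weight decay applied directly to the weights.
    update = [mh / (math.sqrt(vh) + epsilon) + weight_decay * wi
              for mh, vh, wi in zip(m_hat, v_hat, w)]

    # Layer-wise trust ratio ||w|| / ||update||; fall back to 1.0 if either is zero.
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    u_norm = math.sqrt(sum(ui * ui for ui in update))
    trust_ratio = w_norm / u_norm if w_norm > 0.0 and u_norm > 0.0 else 1.0

    # The step length for this layer becomes lr * ||w||, whatever the gradient scale.
    w = [wi - lr * trust_ratio * ui for wi, ui in zip(w, update)]
    return w, m, v


# Example: one step on a three-weight "layer" starting from zero moments.
weights = [1.0, -2.0, 3.0]
grads = [0.1, -0.2, 0.3]
new_weights, m1, v1 = lamb_step(weights, grads, [0.0] * 3, [0.0] * 3, t=1)
```

Because the update is rescaled by the trust ratio, each layer moves by a distance proportional to the norm of its own weights. This is one way to read the claim that only the learning rate needs tuning across batch sizes.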

# Usage

```python
from keras_lamb import LAMBOptimizer

optimizer = LAMBOptimizer(0.001, weight_decay=0.01)

# `model` is assumed to be an already-built Keras model.
model.compile(optimizer, ...)
```

# Requirements

- Keras 2.2.4+

- TensorFlow 1.13+
