https://github.com/evanatyourservice/apollo-tf

# Apollo Optimizer in TensorFlow 2.x

Unofficial implementation of https://arxiv.org/abs/2009.13586

Official implementation: https://github.com/XuezheMax/apollo

### Notes:

- Warmup is important with the Apollo optimizer, so pass a learning rate schedule to `learning_rate` rather than a
constant learning rate. A one-cycle scheduler is provided as an example in `one_cycle_lr_schedule.py`.
- To clip gradient norms as in the paper, add either `clipnorm` (clips each parameter's gradient by its own norm) or
`global_clipnorm` (clips all gradients together by their combined norm) to the optimizer arguments (for example,
`clipnorm=0.1`).
- Decoupled weight decay is used by default.
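
The first two notes can be illustrated with a minimal plain-Python sketch that needs no TensorFlow. It is not the code from this repository: the function names and the default values (`peak_lr`, `init_lr`, `warmup_steps`) are hypothetical and chosen only for illustration, and the schedule is a simple linear warmup rather than the one-cycle schedule in `one_cycle_lr_schedule.py`. The two clipping helpers mirror, on lists of floats, the difference between `clipnorm` and `global_clipnorm` as Keras optimizers describe them.

```python
import math


def linear_warmup_lr(step, peak_lr=0.01, init_lr=1e-5, warmup_steps=500):
    """Linear ramp from init_lr to peak_lr over warmup_steps, then constant.

    All default values are illustrative assumptions, not values from the repo.
    """
    if step >= warmup_steps:
        return peak_lr
    frac = step / warmup_steps
    return init_lr + frac * (peak_lr - init_lr)


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def clip_per_parameter(grads, clipnorm):
    """Like `clipnorm`: rescale each parameter's gradient on its own
    so that its norm is at most clipnorm."""
    clipped = []
    for g in grads:
        norm = l2_norm(g)
        scale = min(1.0, clipnorm / norm) if norm > 0 else 1.0
        clipped.append([v * scale for v in g])
    return clipped


def clip_global(grads, global_clipnorm):
    """Like `global_clipnorm`: rescale all gradients by one shared factor
    so that their combined norm is at most global_clipnorm."""
    norm = l2_norm([v for g in grads for v in g])
    scale = min(1.0, global_clipnorm / norm) if norm > 0 else 1.0
    return [[v * scale for v in g] for g in grads]


if __name__ == "__main__":
    # Two toy "parameters": one with a large gradient, one with a small one.
    grads = [[3.0, 4.0], [0.0, 0.05]]
    print([linear_warmup_lr(s) for s in (0, 250, 500, 1000)])
    print(clip_per_parameter(grads, 0.1))  # only the large gradient shrinks
    print(clip_global(grads, 0.1))         # both shrink by the same factor
```

To use a schedule like this with the optimizer, it would need to be expressed as a Keras learning rate schedule (for example a subclass of `tf.keras.optimizers.schedules.LearningRateSchedule`) and passed as `learning_rate`. Per-parameter clipping leaves small gradients untouched, while global clipping preserves the relative scale between parameters.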