https://github.com/evanatyourservice/apollo-tf
Apollo optimizer in TensorFlow
- Host: GitHub
- URL: https://github.com/evanatyourservice/apollo-tf
- Owner: evanatyourservice
- License: MIT
- Created: 2021-11-08T17:01:27.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-11-09T16:42:08.000Z (almost 4 years ago)
- Last Synced: 2025-01-27T06:44:41.531Z (8 months ago)
- Topics: apollo, optimizer, tensorflow, tensorflow2, tf, tf2
- Language: Python
- Homepage:
- Size: 5.86 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Apollo Optimizer in TensorFlow 2.x
Unofficial implementation of *Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization* (https://arxiv.org/abs/2009.13586).
Official implementation: https://github.com/XuezheMax/apollo
### Notes:
- Warmup is important with the Apollo optimizer, so pass a learning rate schedule rather than a constant value for `learning_rate` (a usage sketch follows these notes). A one-cycle scheduler is provided as an example in `one_cycle_lr_schedule.py`.
- To clip gradient norms as in the paper, add either `clipnorm` (parameter-wise clipping by norm) or `global_clipnorm` (clipping by the global norm across all parameters) to the optimizer arguments, for example `clipnorm=0.1`.
- Decoupled weight decay is used by default.
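
A minimal usage sketch tying these notes together. It assumes the repository exposes a Keras-style `Apollo` optimizer class (the module path, class name, and `weight_decay` argument are assumptions, not confirmed here) and substitutes a simple linear-warmup schedule for the bundled one-cycle example:

```python
import tensorflow as tf

# Hypothetical import; check the repository for the actual module and class name.
from apollo import Apollo


class WarmupSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):
    """Linearly warms the learning rate from 0 to `peak_lr` over `warmup_steps`."""

    def __init__(self, peak_lr, warmup_steps):
        self.peak_lr = peak_lr
        self.warmup_steps = warmup_steps

    def __call__(self, step):
        step = tf.cast(step, tf.float32)
        warmup = tf.cast(self.warmup_steps, tf.float32)
        # Ramp linearly to the peak learning rate, then hold.
        return self.peak_lr * tf.minimum(step / warmup, 1.0)


optimizer = Apollo(
    learning_rate=WarmupSchedule(peak_lr=1e-2, warmup_steps=500),
    weight_decay=1e-4,  # assumed argument name; decoupled weight decay is the default
    clipnorm=0.1,       # parameter-wise gradient clipping, as in the paper
)

model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
model.compile(optimizer=optimizer, loss="mse")
```

`global_clipnorm=0.1` can be substituted for `clipnorm=0.1` to clip by the global norm across all parameters instead.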