https://github.com/lucidrains/adam-atan2-pytorch

Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch

adam artificial-intelligence deep-learning optimizers stability

Host: GitHub
URL: https://github.com/lucidrains/adam-atan2-pytorch
Owner: lucidrains
License: mit
Created: 2024-07-30T15:19:02.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-11-27T01:26:46.000Z (over 1 year ago)
Last Synced: 2025-03-31T09:07:26.326Z (over 1 year ago)
Topics: adam, artificial-intelligence, deep-learning, optimizers, stability
Language: Python
Homepage:
Size: 429 KB
Stars: 102
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

## Adam-atan2 - Pytorch

Implementation of the proposed Adam-atan2 optimizer in Pytorch

A multi-million dollar paper out of Google DeepMind proposes a small change to the Adam update rule (using `atan2`) that removes the epsilon altogether, for numerical stability and scale invariance.

This repository also contains some features for improving plasticity, drawn from the continual learning field.
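
The `atan2` change to the update rule can be illustrated with a minimal scalar sketch. This is not the code from this repository: the function names `adam_step` and `adam_atan2_step` are hypothetical, `m` and `v` stand for the first and second moment estimates, and bias correction, moment updates, the learning rate, and any scaling constants a real implementation might apply are all left out.

```python
import math

def adam_step(m, v, eps=1e-8):
    """Standard Adam direction: m / (sqrt(v) + eps)."""
    return m / (math.sqrt(v) + eps)

def adam_atan2_step(m, v):
    """atan2 variant (sketch): no epsilon, and defined even when m == v == 0."""
    return math.atan2(m, math.sqrt(v))

# Rescaling the gradient by c > 0 scales m by c and v by c**2.
# The atan2 direction is unchanged, while the epsilon in the
# standard rule starts to dominate once sqrt(v) is near eps.
m, v = 0.3, 0.04
for c in (1.0, 1e-6, 1e-12):
    print(c, adam_step(c * m, c * c * v), adam_atan2_step(c * m, c * c * v))

# With all-zero statistics the atan2 direction is simply zero,
# so no epsilon is needed to avoid dividing by zero.
print(adam_atan2_step(0.0, 0.0))
```

Because `sqrt(v)` is never negative, the `atan2` direction always lies between `-pi/2` and `pi/2`, so the per-parameter step is bounded as well.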

## Install

```bash
$ pip install adam-atan2-pytorch
```

## Usage

```python
import torch
from torch import nn

# toy model
model = nn.Linear(10, 1)

# import AdamAtan2 and instantiate with parameters
from adam_atan2_pytorch import AdamAtan2

opt = AdamAtan2(model.parameters(), lr = 1e-4)

# forward and backwards
for _ in range(100):
    loss = model(torch.randn(10))
    loss.backward()

    # optimizer step
    opt.step()
    opt.zero_grad()
```

## Citations

```bibtex
@inproceedings{Everett2024ScalingEA,
    title   = {Scaling Exponents Across Parameterizations and Optimizers},
    author  = {Katie Everett and Lechao Xiao and Mitchell Wortsman and Alex Alemi and Roman Novak and Peter J. Liu and Izzeddin Gur and Jascha Narain Sohl-Dickstein and Leslie Pack Kaelbling and Jaehoon Lee and Jeffrey Pennington},
    year    = {2024},
    url     = {https://api.semanticscholar.org/CorpusID:271051056}
}
```

```bibtex
@inproceedings{Kumar2023MaintainingPI,
    title   = {Maintaining Plasticity in Continual Learning via Regenerative Regularization},
    author  = {Saurabh Kumar and Henrik Marklund and Benjamin Van Roy},
    year    = {2023},
    url     = {https://api.semanticscholar.org/CorpusID:261076021}
}
```

```bibtex
@article{Lewandowski2024LearningCB,
    title   = {Learning Continually by Spectral Regularization},
    author  = {Alex Lewandowski and Saurabh Kumar and Dale Schuurmans and Andr{\'a}s Gy{\"o}rgy and Marlos C. Machado},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2406.06811},
    url     = {https://api.semanticscholar.org/CorpusID:270380086}
}
```

```bibtex
@inproceedings{Taniguchi2024ADOPTMA,
    title   = {ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate},
    author  = {Shohei Taniguchi and Keno Harada and Gouki Minegishi and Yuta Oshima and Seong Cheol Jeong and Go Nagahara and Tomoshi Iiyama and Masahiro Suzuki and Yusuke Iwasawa and Yutaka Matsuo},
    year    = {2024},
    url     = {https://api.semanticscholar.org/CorpusID:273822148}
}
```

```bibtex
@inproceedings{Liang2024CautiousOI,
    title   = {Cautious Optimizers: Improving Training with One Line of Code},
    author  = {Kaizhao Liang and Lizhang Chen and Bo Liu and Qiang Liu},
    year    = {2024},
    url     = {https://api.semanticscholar.org/CorpusID:274234738}
}
```
