https://github.com/lucidrains/adan-pytorch
Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch
- Host: GitHub
- URL: https://github.com/lucidrains/adan-pytorch
- Owner: lucidrains
- License: mit
- Created: 2022-08-25T04:00:22.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2022-09-01T15:38:45.000Z (almost 4 years ago)
- Last Synced: 2025-03-29T12:09:06.741Z (about 1 year ago)
- Topics: artificial-intelligence, deep-learning, optimizer
- Language: Python
- Size: 132 KB
- Stars: 251
- Watchers: 11
- Forks: 9
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

## Adan - Pytorch
Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch.
Explanation from Davis Blalock
## Install
```bash
$ pip install adan-pytorch
```
## Usage
```python
from adan_pytorch import Adan

# mock model

import torch
from torch import nn

model = torch.nn.Sequential(
    nn.Linear(16, 16),
    nn.GELU()
)

# instantiate Adan with model parameters

optim = Adan(
    model.parameters(),
    lr = 1e-3,                  # learning rate (can be much higher than Adam, up to 5-10x)
    betas = (0.02, 0.08, 0.01), # beta 1-2-3 as described in paper - author says most sensitive to beta3 tuning
    weight_decay = 0.02         # weight decay 0.02 is optimal per author
)

# train

for _ in range(10):
    loss = model(torch.randn(16)).sum()
    loss.backward()
    optim.step()
    optim.zero_grad()
```
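To give a feel for what the three betas control, here is a minimal sketch of a single Adan update on one scalar parameter, written from the update rule as I understand it from the cited paper. It is not the code of this package: the function `adan_step`, the `state` dictionary, and the `eps` default are hypothetical names chosen for illustration, there is no bias correction, and treating the gradient difference as zero on the first step is an assumption.

```python
import math

def adan_step(theta, grad, prev_grad, state, lr=1e-3,
              betas=(0.02, 0.08, 0.01), weight_decay=0.02, eps=1e-8):
    """One illustrative Adan update for a single scalar parameter."""
    beta1, beta2, beta3 = betas

    # Assumption: with no previous gradient (first step), the difference is zero.
    diff = 0.0 if prev_grad is None else grad - prev_grad

    # m: moving average of gradients (beta1)
    state["m"] = (1 - beta1) * state["m"] + beta1 * grad
    # v: moving average of gradient differences (beta2)
    state["v"] = (1 - beta2) * state["v"] + beta2 * diff
    # n: moving average of the squared, difference-corrected gradient (beta3)
    state["n"] = (1 - beta3) * state["n"] + beta3 * (grad + (1 - beta2) * diff) ** 2

    # per-parameter step size, then the momentum-plus-difference update
    step_size = lr / (math.sqrt(state["n"]) + eps)
    update = step_size * (state["m"] + (1 - beta2) * state["v"])

    # weight decay applied as a division of the updated parameter
    return (theta - update) / (1 + weight_decay * lr)

# Toy usage: a few steps on f(x) = x**2, whose gradient is 2 * x.
x, prev = 1.0, None
state = {"m": 0.0, "v": 0.0, "n": 0.0}
for _ in range(5):
    g = 2 * x
    x = adan_step(x, g, prev, state, lr=0.1)
    prev = g
print(x)
```

In this sketch `beta3` sets how quickly the step-size normalizer `n` reacts to new gradients, which may be one way to read the note above that tuning is most sensitive to it.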
## Citations
```bibtex
@article{Xie2022AdanAN,
    title   = {Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models},
    author  = {Xingyu Xie and Pan Zhou and Huan Li and Zhouchen Lin and Shuicheng Yan},
    journal = {ArXiv},
    year    = {2022},
    volume  = {abs/2208.06677}
}
```