{"id":27162873,"url":"https://github.com/notedance/adan","last_synced_at":"2025-04-10T03:28:51.851Z","repository":{"id":286828352,"uuid":"962695352","full_name":"NoteDance/Adan","owner":"NoteDance","description":"TensorFlow implementation for Adan optimizer","archived":false,"fork":false,"pushed_at":"2025-04-08T14:37:03.000Z","size":11,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-08T15:37:08.853Z","etag":null,"topics":["deep-learning","deep-reinforcement-learning","keras","machine-learning","optimizer","reinforcement-learning","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NoteDance.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-04-08T14:35:37.000Z","updated_at":"2025-04-08T14:48:42.000Z","dependencies_parsed_at":"2025-04-08T15:41:03.712Z","dependency_job_id":"1de7959c-9e52-446e-8bda-e939e38fad09","html_url":"https://github.com/NoteDance/Adan","commit_stats":null,"previous_names":["notedance/adan"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NoteDance%2FAdan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NoteDance%2FAdan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NoteDance%2FAdan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NoteDance%2FAdan/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NoteDance","download_url":"https://codeload.github.com/NoteDance/Adan/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247957878,"owners_count":21024774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","deep-reinforcement-learning","keras","machine-learning","optimizer","reinforcement-learning","tensorflow"],"created_at":"2025-04-09T01:34:15.536Z","updated_at":"2025-04-09T01:34:16.278Z","avatar_url":"https://github.com/NoteDance.png","language":"Python","readme":"# Adan\n\n**Overview**:\n\nThe **Adan (Adaptive Nesterov Momentum)** optimizer is a next-generation optimization algorithm designed to accelerate training and improve convergence in deep learning models. 
It combines **adaptive gradient estimation** and **multi-step momentum** for enhanced performance.\n\nThis algorithm is introduced in the paper:\n- **\"Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models\"** ([arXiv link](https://arxiv.org/abs/2208.06677)).\n\nThe implementation is inspired by the official repository:\n- [Adan GitHub Repository](https://github.com/sail-sg/Adan)\n\n**Parameters**:\n\n- **`learning_rate`** *(float, default=1e-3)*: Learning rate for the optimizer.\n- **`beta1`** *(float, default=0.98)*: Exponential decay rate for the first moment estimates.\n- **`beta2`** *(float, default=0.92)*: Exponential decay rate for gradient difference momentum.\n- **`beta3`** *(float, default=0.99)*: Exponential decay rate for the second moment estimates.\n- **`epsilon`** *(float, default=1e-8)*: Small constant for numerical stability.\n- **`weight_decay`** *(float, default=0.0)*: Strength of weight decay regularization.\n- **`no_prox`** *(bool, default=False)*: If `True`, disables proximal updates during weight decay.\n- **`foreach`** *(bool, default=True)*: Enables multi-tensor operations for optimization.\n- **`clipnorm`** *(float, optional)*: Clips gradients by their norm.\n- **`clipvalue`** *(float, optional)*: Clips gradients by their value.\n- **`global_clipnorm`** *(float, optional)*: Clips gradients by their global norm.\n- **`use_ema`** *(bool, default=False)*: Enables Exponential Moving Average (EMA) for model parameters.\n- **`ema_momentum`** *(float, default=0.99)*: EMA momentum for parameter averaging.\n- **`ema_overwrite_frequency`** *(int, optional)*: Frequency for overwriting model parameters with EMA values.\n- **`loss_scale_factor`** *(float, optional)*: Scaling factor for loss values in mixed-precision training.\n- **`gradient_accumulation_steps`** *(int, optional)*: Number of steps for gradient accumulation.\n- **`name`** *(str, default=\"adan\")*: Name of the optimizer.\n\n---\n\n**Example 
Usage**:\n\n```python\nimport tensorflow as tf\nfrom adan import Adan  # assumes adan.py from this repository is on the Python path\n\n# Initialize the Adan optimizer\noptimizer = Adan(\n    learning_rate=1e-3,\n    beta1=0.98,\n    beta2=0.92,\n    beta3=0.99,\n    weight_decay=0.01,\n    use_ema=True,\n    ema_momentum=0.999\n)\n\n# Build and compile a small example model\nmodel = tf.keras.Sequential([\n    tf.keras.layers.Dense(64, activation=\"relu\"),\n    tf.keras.layers.Dense(10, activation=\"softmax\")\n])\nmodel.compile(\n    optimizer=optimizer,\n    loss=\"sparse_categorical_crossentropy\",\n    metrics=[\"accuracy\"]\n)\n\n# Train the model (train_dataset and val_dataset are assumed to be tf.data.Dataset objects)\nmodel.fit(train_dataset, validation_data=val_dataset, epochs=10)\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnotedance%2Fadan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnotedance%2Fadan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnotedance%2Fadan/lists"}