https://github.com/epfml/grad-norm-smooth
Official implementation of "Gradient-Normalized Smoothness for Optimization with Approximate Hessians"
- Host: GitHub
- URL: https://github.com/epfml/grad-norm-smooth
- Owner: epfml
- License: apache-2.0
- Created: 2025-06-16T17:49:23.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-06-17T08:12:38.000Z (9 months ago)
- Last Synced: 2025-06-17T09:30:10.685Z (9 months ago)
- Topics: inexact-hessians, second-order-optimization
- Language: Jupyter Notebook
- Homepage: https://arxiv.org/abs/2506.13710
- Size: 609 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Supplementary code for "Gradient-Normalized Smoothness for Optimization with Approximate Hessians"
[arXiv:2506.13710](https://arxiv.org/abs/2506.13710)
This code accompanies the paper:
> Andrei Semenov, Martin Jaggi, Nikita Doikov. *Gradient-Normalized Smoothness for Optimization with Approximate Hessians*. arXiv:2506.13710, June 2025.
## Structure
```sh
src/
methods.py # Algorithm 1 from the paper, algorithms with other adaptive search schemes, gradient methods
oracles.py # LogSumExp, Nonlinear Equations with linear operator and Chebyshev polynomials, Rosenbrock function, etc.
approximations.py # code for Hessian approximations for different oracles
utils.py # code for plotting graphs
data/
    mushrooms.txt      # an example dataset; you can add more here
notebooks/
examples.ipynb # examples of approximations and comparison of methods
```
## Quickstart
Simply run the `notebooks/examples.ipynb` notebook.
At the beginning of the notebook, we provide practical approximations for each oracle.
All of them are compatible with our theory.
In particular, we investigated the following approximations.
| Problem | Name in the paper | Approximation | Code reference in `src/approximations.py` |
|---------|------|--------|-----|
| LogSumExp | Weighted Gauss-Newton | $\frac{1}{\mu}\mathbf{A}^\top \mathrm{Diag}\left(\mathrm{softmax}\left(\mathbf{A}x\right)\right)\mathbf{A}$ | [`approx_hess_fn_logsumexp`](https://github.com/epfml/hess-approx/blob/0d294d9b65dc6bffb1434994abad8fba5a3aa7dd/src/approximations.py#L6) |
| Equations with linear operator | Fisher Term of $\mathbf{H}$ | $\frac{p-2}{\lVert u(x) \rVert^p} \nabla f(x) \nabla f(x)^\top$ | [`approx_hess_fn_fisher_term`](https://github.com/epfml/hess-approx/blob/0d294d9b65dc6bffb1434994abad8fba5a3aa7dd/src/approximations.py#L16) |
| Nonlinear Equations & Rosenbrock | Inexact Hessian | $\lVert u(x)\rVert^{p - 2} \nabla u(x)^\top \mathbf{B} \nabla u(x) + \frac{p - 2}{\lVert u(x) \rVert^p} \nabla f(x) \nabla f(x)^{\top}$ | [`approx_hess_nonlinear_equations`](https://github.com/epfml/hess-approx/blob/0d294d9b65dc6bffb1434994abad8fba5a3aa7dd/src/approximations.py#L31) |
| Nonlinear Equations & Chebyshev polynomials | Inexact Hessian | $\lVert u(x) \rVert^{p - 2} \nabla u(x)^\top \mathbf{B} \nabla u(x) + \frac{p - 2}{\lVert u(x) \rVert^p} \nabla f(x) \nabla f(x)^{\top}$ | [`approx_hess_fn_chebyshev`](https://github.com/epfml/hess-approx/blob/0d294d9b65dc6bffb1434994abad8fba5a3aa7dd/src/approximations.py#L51) |
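As a sanity check for the first row, here is a minimal NumPy sketch of the weighted Gauss-Newton approximation next to the exact LogSumExp Hessian. This is not taken from `src/approximations.py`; the function names and the scaling convention $\pi = \mathrm{softmax}(\mathbf{A}x/\mu)$ are our assumptions for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def gauss_newton_hess(A, x, mu=1.0):
    # Weighted Gauss-Newton: (1/mu) * A^T Diag(pi) A with pi = softmax(Ax/mu)
    pi = softmax(A @ x / mu)
    return (A.T * pi) @ A / mu

def exact_hess(A, x, mu=1.0):
    # Exact Hessian of f(x) = mu * log(sum_i exp(a_i^T x / mu)):
    # (1/mu) * (A^T Diag(pi) A - A^T pi pi^T A)
    pi = softmax(A @ x / mu)
    g = A.T @ pi               # gradient of f
    return ((A.T * pi) @ A - np.outer(g, g)) / mu
```

The gap between the two is the rank-one positive-semidefinite term $\frac{1}{\mu}\mathbf{A}^\top\pi\pi^\top\mathbf{A}$, so the Gauss-Newton matrix upper-bounds the exact Hessian in the Loewner order.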
You can also use a fast implementation of our algorithm, provided by the `grad_norm_smooth_for_rank_one` function in `examples.py`.
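We have not inspected the internals of `grad_norm_smooth_for_rank_one`, but the standard reason rank-one Hessian terms admit a fast implementation is the Sherman–Morrison formula: a system with a rank-one update of a base matrix can be solved with only two solves against the base matrix. A hedged sketch (all names here are ours, not the repository's):

```python
import numpy as np

def solve_rank_one_update(H, g, sigma, v):
    """Solve (H + sigma * g g^T) d = v via the Sherman-Morrison formula.

    Only two solves with H are needed, so when H is fixed (or factored
    once), each step with a fresh rank-one term stays cheap.
    """
    Hinv_v = np.linalg.solve(H, v)
    Hinv_g = np.linalg.solve(H, g)
    correction = sigma * (g @ Hinv_v) / (1.0 + sigma * (g @ Hinv_g))
    return Hinv_v - correction * Hinv_g
```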
Running the notebook reproduces the comparison plots.
**We believe the details provided are clear enough to reproduce the main findings of our paper.**

## Citation
```bib
@misc{semenov2025gradientnormalizedsmoothnessoptimizationapproximate,
title={Gradient-Normalized Smoothness for Optimization with Approximate Hessians},
author={Andrei Semenov and Martin Jaggi and Nikita Doikov},
year={2025},
eprint={2506.13710},
archivePrefix={arXiv},
primaryClass={math.OC},
url={https://arxiv.org/abs/2506.13710},
}
```