{"id":20977041,"url":"https://github.com/githubharald/quasi_cauchy_optimizer","last_synced_at":"2026-05-06T13:04:14.675Z","repository":{"id":107433450,"uuid":"318534624","full_name":"githubharald/quasi_cauchy_optimizer","owner":"githubharald","description":"Implementation of the quasi Cauchy optimizer, an optimization method from the quasi Newton family. It uses a diagonal approximation of the Hessian and therefore has a small memory footprint. ","archived":false,"fork":false,"pushed_at":"2021-07-26T19:59:27.000Z","size":163,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-20T06:13:48.515Z","etag":null,"topics":["gradient-descent-algorithm","machine-learning","optimization-algorithms","quasi-newton"],"latest_commit_sha":null,"homepage":"https://githubharald.github.io/quasi_cauchy.html","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/githubharald.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-12-04T14:05:46.000Z","updated_at":"2023-02-22T12:34:43.000Z","dependencies_parsed_at":"2023-04-19T19:16:28.764Z","dependency_job_id":null,"html_url":"https://github.com/githubharald/quasi_cauchy_optimizer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/githubharald%2Fquasi_cauchy_optimizer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/githubharald%2Fquasi_cauchy_optimizer/tags","rel
eases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/githubharald%2Fquasi_cauchy_optimizer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/githubharald%2Fquasi_cauchy_optimizer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/githubharald","download_url":"https://codeload.github.com/githubharald/quasi_cauchy_optimizer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243381495,"owners_count":20281978,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gradient-descent-algorithm","machine-learning","optimization-algorithms","quasi-newton"],"created_at":"2024-11-19T04:57:08.972Z","updated_at":"2025-12-28T13:54:29.709Z","avatar_url":"https://github.com/githubharald.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Quasi Cauchy Optimizer\n\nImplementation of the quasi Cauchy optimizer as described by [Zhu et al](http://www.math.uwaterloo.ca/~hwolkowi/henry/reports/cauchy.pdf).\nIt is a member of the quasi Newton family.\nThe Hessian is approximated by a diagonal matrix which satisfies the weak secant equation.\nThe method is memory-efficient, because a diagonal matrix has the same memory-footprint as a vector.\n\n## Installation\n\n* Go to the root level of the repository\n* Execute `pip install .`\n* Go to `tests/` and execute `pytest` to check if installation worked\n\n## Usage\n\nTo use the optimizer in your own code, define the function to be minimized, and the gradient of this function 
\n(or compute it using e.g. the autograd package).\nThen, call the `optimize(...)` function with an initial guess of the solution.\nThe result holds both the final iterate (attribute `x`) and the path from the initial to the final iterate (attribute `path`).\nHere is a small example that computes the minimum of a quadratic function:\n\n````python\nfrom quasi_cauchy_optimizer import optimize, UpdateRule\nimport numpy as np\n\n# function to minimize: 5 * x**2 + y**2\ndef func(x):\n    return 5 * x[0]**2 + x[1]**2\n\n# gradient of function: (10x, 2y)\ndef grad(x):\n    return np.asarray([10, 2]) * x\n\n# define start value\nx0 = np.asarray([1, 2])\n\n# run optimizer\nres = optimize(func, grad, x0, UpdateRule.DIAGONAL, grad_zero_tol=1e-5)\n\n# print result\nprint(res.x)\n````\n\nFunction arguments:\n* func: function to be minimized\n* grad: gradient of the function to be minimized\n* x0: start value (initial guess)\n* update_rule\n    * UpdateRule.DIAGONAL: Hessian is approximated as a diagonal matrix\n    * UpdateRule.SCALED_IDENTITY: Hessian is approximated as a scaled identity matrix\n    * UpdateRule.IDENTITY: Hessian is approximated as the identity matrix (that is, vanilla gradient descent, included only as a baseline for evaluation)\n* grad_zero_tol: if the gradient norm is below this value, the algorithm terminates\n* eps: small value that is added to the denominator to avoid division by 0\n* min_curv: Hessian values are clipped to [min_curv, max_curv]\n* max_curv: Hessian values are clipped to [min_curv, max_curv]\n* max_iter: maximum number of iterations\n* verbose: output internal state of the algorithm\n\n\n## Examples\n\n* Install requirements: `pip install -r requirements.txt`\n* Go to `examples/`  \n* Run optimizer on common test functions:\n    * Fast version testing only two functions: `python common_test_functions.py fast`\n    * Test all functions: `python common_test_functions.py`\n* Run optimizer on logistic regression task: `python logistic_regression.py`\n\nExpected 
output for `python common_test_functions.py fast`:\n````\nFunction: beale\nbeale, DIAGONAL, err=0.000, iter=172\nbeale, SCALED_IDENTITY, err=0.000, iter=33\nbeale, IDENTITY, err=0.001, iter=367\n\nFunction: polyNd\npolyNd, DIAGONAL, err=0.371, iter=235\npolyNd, SCALED_IDENTITY, err=0.605, iter=501\npolyNd, IDENTITY, err=0.962, iter=501\n````\n\n![plot](doc/plot.png)\n\n\n## Some notes\nTo guarantee a descent direction, the Hessian approximation is clipped to [min_curv, max_curv], where the minimum value (min_curv) should be set to a small value larger than 0.\nA line search is applied along the computed update direction to obtain a reasonable step size.\n\nThe diagonal approximation (UpdateRule.DIAGONAL) performs best for high-dimensional functions whose scale varies across dimensions.\nOtherwise, the simple scaled identity approximation (UpdateRule.SCALED_IDENTITY) performs best.\nThis includes typical 2D test functions such as Rosenbrock.\n\nFor results and details on how the Hessian approximation is computed, see [this article](https://githubharald.github.io/quasi_cauchy.html).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgithubharald%2Fquasi_cauchy_optimizer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgithubharald%2Fquasi_cauchy_optimizer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgithubharald%2Fquasi_cauchy_optimizer/lists"}