{"id":28209042,"url":"https://github.com/dscamiss/hesse","last_synced_at":"2025-06-12T09:30:59.537Z","repository":{"id":287093708,"uuid":"931796761","full_name":"dscamiss/hesse","owner":"dscamiss","description":"PyTorch tools for Hessian-related operations","archived":false,"fork":false,"pushed_at":"2025-04-11T18:42:13.000Z","size":152,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-17T15:14:01.774Z","etag":null,"topics":["artificial-intelligence","hessian","hessian-matrix","machine-learning","optimization","python","pytorch","second-order-optimization"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dscamiss.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-12T21:44:50.000Z","updated_at":"2025-04-11T18:42:16.000Z","dependencies_parsed_at":"2025-04-09T23:43:25.949Z","dependency_job_id":null,"html_url":"https://github.com/dscamiss/hesse","commit_stats":null,"previous_names":["dscamiss/hesse"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dscamiss/hesse","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dscamiss%2Fhesse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dscamiss%2Fhesse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dscamiss%2Fhesse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dscamiss%2Fhesse/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dscamiss","download_url":"https://codeload.github.com/dscamiss/hesse/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dscamiss%2Fhesse/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259438587,"owners_count":22857549,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","hessian","hessian-matrix","machine-learning","optimization","python","pytorch","second-order-optimization"],"created_at":"2025-05-17T15:13:20.908Z","updated_at":"2025-06-12T09:30:59.504Z","avatar_url":"https://github.com/dscamiss.png","language":"Python","readme":"# `hesse` :snake:\n\n![License](https://img.shields.io/badge/license-MIT-blue)\n![PyTorch](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?logo=PyTorch\u0026logoColor=white)\n![Python](https://img.shields.io/badge/python-3.9-blue.svg)\n![Python](https://img.shields.io/badge/python-3.10-blue.svg)\n![Python](https://img.shields.io/badge/python-3.11-blue.svg)\n![Build](https://github.com/dscamiss/hesse/actions/workflows/python-package.yml/badge.svg)\n[![codecov](https://codecov.io/gh/dscamiss/hesse/graph/badge.svg?token=Z3CGGZJ70B)](https://codecov.io/gh/dscamiss/hesse)\n\n# Introduction\n\nThe goal of this package is to simplify the computation of certain Hessian matrices.  \n\nIn particular, suppose that we are interested in computing the Hessian matrix of `model` with respect to \nits parameters.  
The existing paradigm is to make a \"functional version\" of `model`'s forward pass\n\n```python\ndef functional_forward(params):\n    return torch.func.functional_call(model, params, inputs)\n```\n\nand then compute its Hessian\n\n```python\nparams = dict(model.named_parameters())\nhessian = torch.func.hessian(functional_forward)(params)\n```\n\nThe output `hessian` is a dictionary of dictionaries, such that `hessian[\"P\"][\"Q\"]` is the Hessian matrix block \ncorresponding to named parameters `P` and `Q`.  Extra work is required if we want to assemble the full Hessian matrix, \nif we want to modify this process to obtain a diagonal approximation of the full Hessian matrix, and so on.\n\nThis package aims to remove the extra work by providing user-friendly wrappers for `torch.func` transforms\nand matrix assembly.\n\n# Installation\n\nIn an existing Python 3.9+ environment:\n\n```bash\ngit clone https://github.com/dscamiss/hesse/\npip install ./hesse\n```\n\n# Examples\n\n## Setup\n\nImport packages.\n\n```python\nimport hesse\nimport torch\n```\n\nCreate a toy multi-input, multi-output model.\n\n```python\nclass MimoModel(torch.nn.Module):\n    \"\"\"Multi-input, multi-output demo model.\"\"\"\n\n    def __init__(self, m: int) -\u003e None:\n        super().__init__()\n        self.A = torch.nn.Parameter(torch.randn(m, m))\n        self.B = torch.nn.Parameter(torch.randn(m, m))\n\n    def forward(self, x, y):\n        \"\"\"\n        Run forward pass.\n\n        Args:\n            x: First input tensor of shape (b, n).\n            y: Second input tensor of shape (b, n).\n\n        Returns:\n            The matrix\n\n            [ tr(A^t A) x_{    0, :}   tr(B^t B) y_{    0, :} ]\n            [ tr(A^t A) x_{    1, :}   tr(B^t B) y_{    1, :} ]\n            [           :                        :            ]\n            [ tr(A^t A) x_{b - 1, :}   tr(B^t B) y_{b - 1, :} ].\n        \"\"\"\n        row_1 = torch.trace(self.A.T @ self.A) * x\n        row_2 = 
torch.trace(self.B.T @ self.B) * y\n        return torch.hstack((row_1, row_2))\n```\n\nMake an instance of `MimoModel` and batch inputs.\n\n```python\nmodel = MimoModel(2)\n\nx = torch.Tensor(\n    [\n        [1.0, 2.0],\n        [3.0, 4.0],\n    ]\n)\ny = -1.0 * x\n```\n\n## Model Hessians\n\nComputing the Hessian matrix of `model` is easy:\n\n```python\nhessian = hesse.model_hessian_matrix(model=model, inputs=(x, y))\n```\n\nGenerally speaking, the shape of the Hessian matrix will be `(batch_size, output_size, ...)`.  In this instance, `batch_size = 2` and `output_size = 4`, \nso that `hessian` has shape `(2, 4, 8, 8)`.\n\nTo compute the Hessian matrix with respect to a subset of the model parameters, just provide a list of parameter names:\n\n```python\nhessian = hesse.model_hessian_matrix(model=model, inputs=(x, y), params=[\"A\"])\n```\n\n## Loss function Hessians\n\nCreate a loss criterion and batch target output:\n\n```python\ncriterion = torch.nn.MSELoss()\ntarget = torch.randn(2, 4)\n```\n\nComputing the Hessian matrix of the loss function `criterion(model(inputs), target)` is easy:\n\n```python\nloss_hessian = hesse.loss_hessian_matrix(\n    model=model,\n    criterion=criterion,\n    inputs=(x, y),\n    target=target,\n)\n```\n\nAs above, we can compute the Hessian matrix with respect to a subset of the model parameters:\n\n```python\nloss_hessian = hesse.loss_hessian_matrix(\n    model=model,\n    criterion=criterion,\n    inputs=(x, y),\n    target=target,\n    params=[\"A\"],\n)\n```\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdscamiss%2Fhesse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdscamiss%2Fhesse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdscamiss%2Fhesse/lists"}