{"id":13737936,"url":"https://github.com/archinetai/surgeon-pytorch","last_synced_at":"2025-04-04T14:03:30.339Z","repository":{"id":37394277,"uuid":"483844432","full_name":"archinetai/surgeon-pytorch","owner":"archinetai","description":"A library to inspect and extract intermediate layers of PyTorch models.","archived":false,"fork":false,"pushed_at":"2022-05-12T21:55:03.000Z","size":192,"stargazers_count":472,"open_issues_count":2,"forks_count":16,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-28T13:04:39.548Z","etag":null,"topics":["artificial-intelligence","deep-learning","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/archinetai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-04-20T23:32:42.000Z","updated_at":"2025-02-27T08:05:44.000Z","dependencies_parsed_at":"2022-09-04T16:20:41.524Z","dependency_job_id":null,"html_url":"https://github.com/archinetai/surgeon-pytorch","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/archinetai%2Fsurgeon-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/archinetai%2Fsurgeon-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/archinetai%2Fsurgeon-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/archinetai%2Fsurgeon-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/archinetai","download_url":"https://codeload.github.com/ar
chinetai/surgeon-pytorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247184831,"owners_count":20897842,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-learning","pytorch"],"created_at":"2024-08-03T03:02:06.427Z","updated_at":"2025-04-04T14:03:30.320Z","avatar_url":"https://github.com/archinetai.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"\u003cimg src=\"./LOGO.png\"\u003e\u003c/img\u003e\n\nA library to inspect and extract intermediate layers of PyTorch models.\n\n### Why?\nIt's often the case that we want to _inspect_ intermediate layers of PyTorch models without modifying the code. This can be useful to get attention matrices of language models, visualize layer embeddings, or apply a loss function to intermediate layers. Sometimes we want to _extract_ subparts of the model and run them independently, either to debug them or to train them separately. 
All of this can be done with Surgeon without changing one line of the original model.\n\n## Install\n\n```bash\n$ pip install surgeon-pytorch\n```\n\n[![PyPI - Python Version](https://img.shields.io/pypi/v/surgeon-pytorch?style=flat\u0026colorA=0f0f0f\u0026colorB=0f0f0f)](https://pypi.org/project/surgeon-pytorch/)\n\n## Usage\n\n### Inspect\n\nGiven a PyTorch model we can display all layers using `get_layers`:\n\n```python\nimport torch\nimport torch.nn as nn\n\nfrom surgeon_pytorch import Inspect, get_layers\n\nclass SomeModel(nn.Module):\n\n    def __init__(self):\n        super().__init__()\n        self.layer1 = nn.Linear(5, 3)\n        self.layer2 = nn.Linear(3, 2)\n        self.layer3 = nn.Linear(2, 1)\n\n    def forward(self, x):\n        x1 = self.layer1(x)\n        x2 = self.layer2(x1)\n        y = self.layer3(x2)\n        return y\n\n\nmodel = SomeModel()\nprint(get_layers(model)) # ['layer1', 'layer2', 'layer3']\n```\n\nThen we can wrap our `model` using `Inspect`. On every forward call, the wrapped model returns the original output together with the outputs of the requested layers (as the second return value):\n\n```python\nmodel_wrapped = Inspect(model, layer='layer2')\nx = torch.rand(1, 5)\ny, x2 = model_wrapped(x)\nprint(x2) # tensor([[-0.2726,  0.0910]], grad_fn=\u003cAddmmBackward0\u003e)\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e \u003cb\u003e Inspect Multiple Layers \u003c/b\u003e \u003c/summary\u003e\n\u003cbr\u003e\n\nWe can provide a list of layers:\n\n```python\nmodel_wrapped = Inspect(model, layer=['layer1', 'layer2'])\nx = torch.rand(1, 5)\ny, [x1, x2] = model_wrapped(x)\nprint(x1) # tensor([[ 0.1739,  0.3844, -0.4724]], grad_fn=\u003cAddmmBackward0\u003e)\nprint(x2) # tensor([[-0.2238,  0.0107]], grad_fn=\u003cAddmmBackward0\u003e)\n```\n\u003c/details\u003e\n     \n\u003cdetails\u003e\n\u003csummary\u003e \u003cb\u003e Name Inspected Layer Outputs \u003c/b\u003e \u003c/summary\u003e\n\u003cbr\u003e\n\nWe can provide a dictionary to get named 
outputs:\n\n```python\nmodel_wrapped = Inspect(model, layer={'layer1': 'x1', 'layer2': 'x2'})\nx = torch.rand(1, 5)\ny, layers = model_wrapped(x)\nprint(layers)\n\"\"\"\n{\n    'x1': tensor([[ 0.3707,  0.6584, -0.2970]], grad_fn=\u003cAddmmBackward0\u003e),\n    'x2': tensor([[-0.1953, -0.3408]], grad_fn=\u003cAddmmBackward0\u003e)\n}\n\"\"\"\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e \u003cb\u003e API \u003c/b\u003e \u003c/summary\u003e\n\u003cbr\u003e\n    \n```python\nmodel = Inspect(\n    model: nn.Module,\n    layer: Union[str, Sequence[str], Dict[str, str]],\n    keep_output: bool = True,\n)\n```\n    \n\u003c/details\u003e\n\n\n### Extract\n\nGiven a PyTorch model we can display all intermediate nodes of the graph using `get_nodes`:\n\n```python\nimport torch\nimport torch.nn as nn\nfrom surgeon_pytorch import Extract, get_nodes\n\nclass SomeModel(nn.Module):\n\n    def __init__(self):\n        super().__init__()\n        self.layer1 = nn.Linear(5, 3)\n        self.layer2 = nn.Linear(3, 2)\n        self.layer3 = nn.Linear(2, 1)  # input size must match layer2's 2 output features\n\n    def forward(self, x):\n        x1 = torch.relu(self.layer1(x))\n        x2 = torch.sigmoid(self.layer2(x1))\n        y = self.layer3(x2).tanh()\n        return y\n\nmodel = SomeModel()\nprint(get_nodes(model)) # ['x', 'layer1', 'relu', 'layer2', 'sigmoid', 'layer3', 'tanh']\n```\n\nThen we can extract outputs using `Extract`, which will create a new model that returns the requested output node:\n\n```python\nmodel_ext = Extract(model, node_out='sigmoid')\nx = torch.rand(1, 5)\nsigmoid = model_ext(x)\nprint(sigmoid) # tensor([[0.5570, 0.3652]], grad_fn=\u003cSigmoidBackward0\u003e)\n```\n\nWe can also extract a model with new input nodes:\n\n```python\nmodel_ext = Extract(model, node_in='layer1', node_out='sigmoid')\nlayer1 = torch.rand(1, 3)\nsigmoid = model_ext(layer1)\nprint(sigmoid) # tensor([[0.5444, 0.3965]], grad_fn=\u003cSigmoidBackward0\u003e)\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e 
\u003cb\u003e Multiple Nodes \u003c/b\u003e \u003c/summary\u003e\n\u003cbr\u003e    \n    \nWe can also provide multiple inputs and outputs and name them:\n\n```python\nmodel_ext = Extract(model, node_in={ 'layer1': 'x' }, node_out={ 'sigmoid': 'y1', 'relu': 'y2'})\nout = model_ext(x = torch.rand(1, 3))\nprint(out)\n\"\"\"\n{\n    'y1': tensor([[0.4437, 0.7152]], grad_fn=\u003cSigmoidBackward0\u003e),\n    'y2': tensor([[0.0555, 0.9014, 0.8297]]),\n}\n\"\"\"\n```\n    \n\u003c/details\u003e\n\n    \n\u003cdetails\u003e\n\u003csummary\u003e \u003cb\u003e Graph Input/Output Summary \u003c/b\u003e \u003c/summary\u003e\n\u003cbr\u003e \n    \nNote that changing an input node might not be enough to cut the graph (there might be other dependencies connected to previous inputs). To view all inputs of the new graph we can call `model_ext.summary` which will give us an overview of all required inputs and returned outputs:\n\n```python\nimport torch\nimport torch.nn as nn\nfrom surgeon_pytorch import Extract, get_nodes\n\nclass SomeModel(nn.Module):\n\n    def __init__(self):\n        super().__init__()\n        self.layer1a = nn.Linear(2, 2)\n        self.layer1b = nn.Linear(2, 2)\n        self.layer2 = nn.Linear(2, 1)\n\n    def forward(self, x):\n        a = self.layer1a(x)\n        b = self.layer1b(x)\n        c = torch.add(a, b)\n        y = self.layer2(c)\n        return y\n\nmodel = SomeModel()\nprint(get_nodes(model)) # ['x', 'layer1a', 'layer1b', 'add', 'layer2']\n\nmodel_ext = Extract(model, node_in = {'layer1a': 'my_input'}, node_out = {'add': 'my_add'})\nprint(model_ext.summary) # {'input': ('x', 'my_input'), 'output': {'my_add': add}}\n\nout = model_ext(x = torch.rand(1, 2), my_input = torch.rand(1,2))\nprint(out) # {'my_add': tensor([[ 0.3722, -0.6843]], grad_fn=\u003cAddBackward0\u003e)}\n```\n\n\u003c/details\u003e\n    \n\u003cdetails\u003e\n\u003csummary\u003e \u003cb\u003e API \u003c/b\u003e \u003c/summary\u003e\n\u003cbr\u003e \n\n#### 
API\n\n```python\nmodel = Extract(\n    model: nn.Module,\n    node_in: Optional[Union[str, Sequence[str], Dict[str, str]]] = None,\n    node_out: Optional[Union[str, Sequence[str], Dict[str, str]]] = None,\n    tracer: Optional[Type[Tracer]] = None,          # Tracer class used, default: torch.fx.Tracer\n    concrete_args: Optional[Dict[str, Any]] = None, # Tracer concrete_args, default: None\n    keep_output: bool = None,                       # Set to `True` to return original outputs as first argument, default: True except if node_out are provided\n    share_modules: bool = False,                    # Set to `True` if you want to share module weights with original model\n)\n```\n\n\u003c/details\u003e\n\n\n### Inspect vs Extract\nThe `Inspect` class always executes the entire model provided as input, and it uses hooks to record the tensor values as they flow through. The advantages of this approach are that (1) we don't create a new module, and (2) it allows for a dynamic execution graph (i.e. `for` loops and `if` statements that depend on inputs). The downsides of `Inspect` are that (1) if we only need to execute part of the model some computation is wasted, and (2) we can only output values from `nn.Module` layers – no intermediate function values.\n\nThe `Extract` class builds an entirely new model using symbolic tracing. The advantages of this approach are that (1) we can crop the graph anywhere and get a new model that computes only that part, (2) we can extract values from intermediate functions (not only layers), and (3) we can also change input tensors. 
The downside of `Extract` is that only static graphs are allowed (note that most models have static graphs).\n\n## TODO\n- [x] add extract function to get intermediate block\n- [x] add model inputs/outputs summary\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farchinetai%2Fsurgeon-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farchinetai%2Fsurgeon-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farchinetai%2Fsurgeon-pytorch/lists"}