{"id":13935993,"url":"https://github.com/google-research/morph-net","last_synced_at":"2025-05-14T19:06:17.542Z","repository":{"id":34658377,"uuid":"175291625","full_name":"google-research/morph-net","owner":"google-research","description":"Fast \u0026 Simple Resource-Constrained Learning of Deep Network Structure","archived":false,"fork":false,"pushed_at":"2025-01-29T14:09:17.000Z","size":6432,"stargazers_count":1029,"open_issues_count":24,"forks_count":151,"subscribers_count":34,"default_branch":"master","last_synced_at":"2025-04-06T13:02:10.714Z","etag":null,"topics":["automl","deep-learning","machine-learning","neural-architecture-search","python","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-03-12T20:30:12.000Z","updated_at":"2025-04-04T02:15:48.000Z","dependencies_parsed_at":"2024-01-09T16:08:41.305Z","dependency_job_id":"ddeda095-295e-445a-bdc2-c4e0127ae7ec","html_url":"https://github.com/google-research/morph-net","commit_stats":{"total_commits":126,"total_committers":20,"mean_commits":6.3,"dds":0.6825396825396826,"last_synced_commit":"812c626c06e37d72b5cadf52de948a525665a7ac"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fmorph-net","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fmorph-net/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fmorph-net/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fmorph-net/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-research","download_url":"https://codeload.github.com/google-research/morph-net/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248741148,"owners_count":21154249,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automl","deep-learning","machine-learning","neural-architecture-search","python","tensorflow"],"created_at":"2024-08-07T23:02:16.894Z","updated_at":"2025-04-13T15:51:11.160Z","avatar_url":"https://github.com/google-research.png","language":"Python","readme":"# MorphNet: Fast \u0026 Simple Resource-Constrained Learning of Deep Network Structure\n\n[TOC]\n\n## New: FiGS: Fine-Grained Stochastic Architecture Search\nFiGS, is a probabilistic approach to channel regularization that we introduced\nin [Fine-Grained Stochastic Architecture 
## Usage

Suppose you have a working convolutional neural network for image classification
but want to shrink the model to satisfy some constraints (e.g., memory,
latency). Given an existing model (the “seed network”) and a target criterion,
MorphNet will propose a new model by adjusting the number of output channels in
each convolution layer.

Note that MorphNet does not change the topology of the network -- the proposed
model will have the same number of layers and connectivity pattern as the seed
network.

To use MorphNet, you must:

1.  Choose a regularizer from `morphnet.network_regularizers`. The choice is
    based on

    *   your target cost (e.g., FLOPs, latency)
    *   your ability to add new layers to your model:
        * Add our probabilistic gating operation after any layer you wish to
          prune, and use the `LogisticSigmoid` regularizers. **\[recommended\]**
        * If you are unable to add new layers, select the regularizer type based
          on your network architecture: use the `Gamma` regularizer if the seed
          network has BatchNorm; use `GroupLasso` otherwise \[deprecated\].

    Note: If you use BatchNorm, you must enable the scale parameters (“gamma
    variables”), i.e., by setting `scale=True` if you are using
    `tf.keras.layers.BatchNormalization`.

    Note: If you are using `LogisticSigmoid`, don't forget to add the
    probabilistic gating op! See below for an example.

2.  Initialize the regularizer with a threshold and the output boundary ops and
    (optionally) the input boundary ops of your model.

    The MorphNet regularizer crawls your graph starting from the output boundary
    and applies regularization to some of the ops it encounters. When it
    encounters any of the input boundary ops, it does not crawl past them (the
    ops in the input boundary are not regularized). The threshold determines
    which output channels can be eliminated.

3.  Add the regularization term to your loss.

    As always, the regularization loss must be scaled. We recommend searching
    for the scaling hyperparameter (*regularization strength*) along a
    logarithmic scale spanning a few orders of magnitude around
    `1/(initial cost)`. For example, if the seed network starts with 1e9 FLOPs,
    explore regularization strengths around 1e-9 (see the sketch after this
    list).

    Note: MorphNet does not currently add the regularization loss to the
    `tf.GraphKeys.REGULARIZATION_LOSSES` collection; this choice is subject to
    revision.

    Note: Do not confuse `get_regularization_term()` (the loss you should add to
    your training) with `get_cost()` (the estimated cost of the network if the
    proposed structure is applied).

4.  Train the model.

    Note: We recommend using a fixed learning rate (no decay) for this step,
    though this is not strictly necessary.

5.  Save the proposed model structure with the `StructureExporter`.

    The exported files are in JSON format. Note that as the training progresses,
    the proposed model structure will change. There are no specific guidelines
    on the stopping time, although you would likely want to wait for the
    regularization loss (reported via summaries) to stabilize.

6.  (Optional) Create summary ops to monitor the training progress through
    TensorBoard.

7.  Modify your model using the `StructureExporter` output.

8.  Retrain the model from scratch without the MorphNet regularizer.

    Note: Use the standard values for all hyperparameters (such as the learning
    rate schedule).

9.  (Optional) Uniformly expand the network to adjust the accuracy vs. cost
    trade-off as desired. Alternatively, this step can be performed *before*
    the structure learning step.
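Below is a minimal, self-contained sketch of the strength search recommended in step 3. The helper and the 1e9-FLOPs figure are illustrative only, not MorphNet API; each candidate value would be used for a separate structure-learning run.

```python
def candidate_strengths(initial_cost, orders_of_magnitude=2):
  """Hypothetical helper (not MorphNet API): log-scale sweep around 1 / initial cost."""
  base = 1.0 / initial_cost
  return [base * 10.0**k
          for k in range(-orders_of_magnitude, orders_of_magnitude + 1)]

# Example: a seed network that starts at roughly 1e9 FLOPs.
print(candidate_strengths(1e9))
# -> approximately [1e-11, 1e-10, 1e-9, 1e-8, 1e-7]; train one structure-learning
# run per value and pick the point on the accuracy/cost curve you want.
```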
We refer to the first round of training as *structure learning* and the second
round as *retraining*.

To summarize, the key hyperparameters for MorphNet are:

*   Regularization strength
*   Alive threshold

Note that the regularizer type is not a hyperparameter because it is uniquely
determined by the metric of interest (FLOPs, latency) and the presence of
BatchNorm.

## Regularizer Types

Regularizer classes can be found under the `network_regularizers/` directory.
They are named by the algorithm they use and the target cost they attempt to
minimize. For example, `LogisticSigmoidFlopsRegularizer` uses a
Logistic-Sigmoid probabilistic method to regularize the FLOP cost, and
`GammaModelSizeRegularizer` uses the batch norm gamma to regularize the model
size cost.

### Regularizer Algorithms

* **[NEW] LogisticSigmoid** is designed to control any model type, but requires
  adding simple `gating layers` to your model.
* **GroupLasso** is designed for models without batch norm.
* **Gamma** is designed for models with batch norm; it requires that the batch
  norm scale is enabled.

### Regularizer Target Costs

* *FLOPs* targets the FLOP count of the inference network.
* *Model Size* targets the number of weights of the network.
* *Latency* optimizes for the estimated inference latency of the network, based
  on the specific hardware characteristics.
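As a quick illustration of this naming scheme, the snippet below references the two classes mentioned above. The `flop_regularizer` module is used in the example further down; the `model_size_regularizer` module name is assumed here, so confirm it against the `network_regularizers/` directory.

```python
from morph_net.network_regularizers import flop_regularizer
from morph_net.network_regularizers import model_size_regularizer  # assumed module name

# Class names follow the pattern <Algorithm><TargetCost>Regularizer:
flops_cls = flop_regularizer.LogisticSigmoidFlopsRegularizer  # LogisticSigmoid x FLOPs
size_cls = model_size_regularizer.GammaModelSizeRegularizer   # Gamma x model size
```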
## Examples

### Adding a FLOPs Regularizer

The example below demonstrates how to use MorphNet to reduce the number of FLOPs
in your model. In this example, the regularizer will traverse the graph starting
with `logits`, and will not go past any op that is earlier in the graph than the
`inputs` or `labels`; this allows you to specify the subgraph for MorphNet to
optimize.

<!-- TODO Add Keras example. -->
```python
import tensorflow.compat.v1 as tf  # the example uses TF1-style APIs (tf.layers, tf.train, tf.Session)

from morph_net.network_regularizers import flop_regularizer
from morph_net.probabilistic_gating import activation_gating  # assumed import path for the gating op
from morph_net.tools import structure_exporter

def build_model(inputs, labels, is_training, ...):
  gated_relu = activation_gating.gated_relu_activation()

  net = tf.layers.conv2d(inputs, filters=256, kernel_size=[5, 5])
  net = gated_relu(net, is_training=is_training)

  ...
  ...

  net = tf.layers.conv2d(net, filters=1024, kernel_size=[3, 3])
  net = gated_relu(net, is_training=is_training)

  logits = tf.reduce_mean(net, [1, 2])
  logits = tf.layers.dense(logits, units=1024)
  return logits

inputs, labels = preprocessor()
logits = build_model(inputs, labels, is_training=True, ...)

network_regularizer = flop_regularizer.LogisticSigmoidFlopsRegularizer(
    output_boundary=[logits.op],
    input_boundary=[inputs.op, labels.op],
    alive_threshold=0.1  # Value in [0, 1]. This default works well for most cases.
)
regularization_strength = 1e-10
regularizer_loss = (network_regularizer.get_regularization_term() *
                    regularization_strength)

model_loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)

train_op = optimizer.minimize(model_loss + regularizer_loss)
```

You should monitor the progress of structure-learning training via TensorBoard.
In particular, you should consider adding a summary that computes the current
MorphNet regularization loss and the cost if the currently proposed structure is
adopted.

```python
tf.summary.scalar('RegularizationLoss', regularizer_loss)
tf.summary.scalar(network_regularizer.cost_name, network_regularizer.get_cost())
```

![TensorBoardDisplayOfFlops](g3doc/tensorboard.png "Example of the TensorBoard display of the resource regularized by MorphNet.")

Larger values of `regularization_strength` will converge to a smaller effective
FLOP count. If `regularization_strength` is large enough, the FLOP count will
collapse to zero. Conversely, if it is small enough, the FLOP count will remain
at its initial value and the network structure will not vary. The
`regularization_strength` parameter is your knob to control where you want to be
on the price-performance curve. The `alive_threshold` parameter is used for
determining when an activation is alive.

### Extracting the Architecture Learned by MorphNet

During training, you should save a JSON file that contains the learned structure
of the network, that is, the count of activations in a given layer kept alive
(as opposed to removed) by MorphNet.

```python
exporter = structure_exporter.StructureExporter(
    network_regularizer.op_regularizer_manager)

with tf.Session() as sess:
  tf.global_variables_initializer().run()
  for step in range(max_steps):
    _, structure_exporter_tensors = sess.run([train_op, exporter.tensors])
    if step % 1000 == 0:
      exporter.populate_tensor_values(structure_exporter_tensors)
      exporter.create_file_and_save_alive_counts(train_dir, step)
```
## Misc

Contact: morphnet@google.com

### Maintainers

*   Elad Eban, github: [eladeban](https://github.com/eladeban)
*   Andrew Poon, github: [ayp-google](https://github.com/ayp-google)
*   Yair Movshovitz-Attias, github: [yairmov](https://github.com/yairmov)
*   Max Moroz, github: [pkch](https://github.com/pkch)

### Contributors

*   Ariel Gordon, github: [gariel-google](https://github.com/gariel-google).