{"id":26256388,"url":"https://github.com/google-deepmind/dks","last_synced_at":"2025-10-03T19:52:38.423Z","repository":{"id":37459787,"uuid":"463631084","full_name":"google-deepmind/dks","owner":"google-deepmind","description":"Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural network models (and their initializations) to make them easier to train.","archived":false,"fork":false,"pushed_at":"2025-07-01T11:04:46.000Z","size":1292,"stargazers_count":72,"open_issues_count":2,"forks_count":6,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-09-25T10:56:23.258Z","etag":null,"topics":["artificial-intelligence","deep-learning","jax","machine-learning","neural-networks","neural-tangent-kernel","pytorch","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-deepmind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-02-25T18:07:46.000Z","updated_at":"2025-09-15T07:24:51.000Z","dependencies_parsed_at":"2024-06-18T03:28:53.746Z","dependency_job_id":"2dde1eed-30ad-4b05-9ba4-4951db8a7155","html_url":"https://github.com/google-deepmind/dks","commit_stats":null,"previous_names":["google-deepmind/dks","deepmind/dks"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/google-deepmind/dks","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdks","tags_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-deepmind","download_url":"https://codeload.github.com/google-deepmind/dks/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdks/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278219765,"owners_count":25950350,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-03T02:00:06.070Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-learning","jax","machine-learning","neural-networks","neural-tangent-kernel","pytorch","tensorflow"],"created_at":"2025-03-13T20:17:47.050Z","updated_at":"2025-10-03T19:52:38.391Z","avatar_url":"https://github.com/google-deepmind.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![CI status](https://github.com/deepmind/dks/workflows/ci/badge.svg)\n![pypi](https://img.shields.io/pypi/v/dks)\n\n# Official Python package for Deep Kernel Shaping (DKS) and Tailored Activation Transformations (TAT)\n\nThis Python 
package implements the activation function transformations, weight\ninitializations, and dataset preprocessing used in Deep Kernel Shaping (DKS) and\nTailored Activation Transformations (TAT). DKS and TAT, which were introduced in\nthe [DKS paper] and [TAT paper], are methods for constructing/transforming\nneural networks to make them much easier to train. For example, these methods\ncan be used in conjunction with K-FAC to train deep vanilla convnets\n(without skip connections or normalization layers) as fast as standard ResNets\nof the same depth.\n\nThe package supports the JAX, PyTorch, and TensorFlow tensor programming\nframeworks.\n\nQuestions/comments about the code can be sent to\n[dks-dev@google.com](mailto:dks-dev@google.com).\n\n**NOTE:** we are not taking code contributions from GitHub at this time. All PRs\nfrom GitHub will be rejected. Instead, please email us if you find a bug.\n\n## Usage\n\nFor each of the supported tensor programming frameworks, there is a\ncorresponding subpackage which handles the activation function transformations,\nweight initializations, and (optional) data preprocessing. (These are `dks.jax`,\n`dks.pytorch`, and `dks.tensorflow`.) It's up to the user to import these and\nuse them appropriately within their model code. Activation functions are\ntransformed by the function `get_transformed_activations()` in the module\n`activation_transform` of the appropriate subpackage. Sampling initial\nparameters is done using functions inside of the module\n`parameter_sampling_functions` of said subpackage. And data preprocessing is\ndone using the function `per_location_normalization` inside of the module\n`data_preprocessing` of said subpackage. Note that in order to avoid having to\nimport all of the tensor programming frameworks, the user is required to\nindividually import whatever framework subpackage they want, e.g. `import\ndks.jax`. 
Meanwhile, `import dks` won't actually do anything.\n\n`get_transformed_activations()` requires the user to pass either the \"maximal\nslope function\" for DKS, the \"subnet maximizing function\" for TAT with Leaky\nReLUs, or the \"maximal curvature function\" for TAT with smooth activation\nfunctions. (The subnet maximizing function also handles DKS and TAT with smooth\nactivations.) These are special functions that encode information about the\nparticular model architecture. See the section titled \"Summary of our method\" of\nthe [DKS paper] for a procedure to construct the maximal slope function for a\ngiven model, or the appendix section titled \"Additional details and pseudocode\nfor activation function transformations\" of the [TAT paper] for procedures to\nconstruct the other two functions.\n\nIn addition to these things, the user is responsible for ensuring that their\nmodel meets the architectural requirements of DKS/TAT, and for converting any\nweighted sums into \"normalized sums\" (which are weighted sums whose\nnon-trainable weights have a sum of squares equal to 1). See the section titled\n\"Summary of our method\" of the [DKS paper] for more details.\n\nNote that the data preprocessing method implemented, called Per-Location\nNormalization (PLN), may not always be needed in practice, but we have observed\ncertain situations where not using it can lead to problems. (For example, training\non datasets that contain all-zero pixels, such as CIFAR-10.) Also\nnote that ReLUs are only partially supported by DKS, and unsupported by TAT, and\nso their use is *highly* discouraged. Instead, one should use Leaky ReLUs, which\nare fully supported by DKS, and work especially well with TAT.\n\n## Example\n\n`dks.examples.haiku.modified_resnet` is a [Haiku] ResNet model which has been\nmodified as described in the DKS/TAT papers, and includes support for both DKS\nand TAT. 
When constructed with its default arguments, it removes the\nnormalization layers and skip connections found in standard ResNets, making it a\n\"vanilla network\". It can be used as an instructive example for how to build\nDKS/TAT models using this package. See the section titled \"Application to\nvarious modified ResNets\" from the [DKS paper] for more details.\n\n## Installation\n\nThis package can be installed directly from GitHub using `pip` with\n\n```bash\npip install git+https://github.com/deepmind/dks.git\n```\n\nor\n\n```bash\npip install -e git+https://github.com/deepmind/dks.git#egg=dks[\u003cextras\u003e]\n```\n\nOr from PyPI with\n\n```bash\npip install dks\n```\n\nor\n\n```bash\npip install dks[\u003cextras\u003e]\n```\n\nHere `\u003cextras\u003e` is a comma-separated list of strings (with no spaces) that can\nbe passed to install extra dependencies for different tensor programming\nframeworks. Valid strings are `jax`, `tf`, and `pytorch`. So for example, to\ninstall `dks` with the extra requirements for JAX and PyTorch, one does\n\n```bash\npip install dks[jax,pytorch]\n```\n\n## Testing\n\nTo run tests in a Python virtual environment with specific pinned versions of\nall the dependencies one can do:\n\n```bash\ngit clone https://github.com/deepmind/dks.git\ncd dks\n./test.sh\n```\n\nHowever, it is strongly recommended that you run the tests in the same Python\nenvironment (with the same package versions) as you plan to actually use `dks`.\nThis can be accomplished by installing `dks` for all three tensor programming\nframeworks (e.g. 
with `pip install dks[jax,pytorch,tf]` or some other\ninstallation method), and then doing\n\n```bash\npip install pytest-xdist\ngit clone https://github.com/deepmind/dks.git\ncd dks\npython -m pytest -n 16 tests\n```\n\n## Disclaimer\n\nThis is not an official Google product.\n\n[DKS paper]: https://arxiv.org/abs/2110.01765\n[TAT paper]: https://openreview.net/forum?id=U0k7XNTiFEq\n[Haiku]: https://github.com/deepmind/dm-haiku\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fdks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-deepmind%2Fdks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fdks/lists"}