{"id":13456027,"url":"https://github.com/google-deepmind/penzai","last_synced_at":"2025-05-13T23:06:50.192Z","repository":{"id":234010093,"uuid":"781756197","full_name":"google-deepmind/penzai","owner":"google-deepmind","description":"A JAX research toolkit for building, editing, and visualizing neural networks.","archived":false,"fork":false,"pushed_at":"2025-04-26T07:06:25.000Z","size":507636,"stargazers_count":1779,"open_issues_count":10,"forks_count":62,"subscribers_count":18,"default_branch":"main","last_synced_at":"2025-05-11T10:52:31.733Z","etag":null,"topics":["fine-tuning","interpretability","jax","neural-networks","visualization"],"latest_commit_sha":null,"homepage":"https://penzai.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-deepmind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-04T01:13:03.000Z","updated_at":"2025-05-10T14:08:54.000Z","dependencies_parsed_at":"2024-05-05T21:24:17.697Z","dependency_job_id":"9cab57e7-751b-4625-b27f-16655d31daab","html_url":"https://github.com/google-deepmind/penzai","commit_stats":{"total_commits":75,"total_committers":6,"mean_commits":12.5,"dds":"0.10666666666666669","last_synced_commit":"fda6cd1e6883348ce7ff705d78149bc19631e93a"},"previous_names":["google-deepmind/penzai"],"tags_count":25,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fpenzai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goog
le-deepmind%2Fpenzai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fpenzai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fpenzai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-deepmind","download_url":"https://codeload.github.com/google-deepmind/penzai/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254040957,"owners_count":22004637,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fine-tuning","interpretability","jax","neural-networks","visualization"],"created_at":"2024-07-31T08:01:15.092Z","updated_at":"2025-05-13T23:06:45.156Z","avatar_url":"https://github.com/google-deepmind.png","language":"Python","funding_links":[],"categories":["Python","Mechanistic interpretability libraries","Libraries"],"sub_categories":[],"readme":"# Penzai\n\n\u003e **盆 (\"pen\", tray) 栽 (\"zai\", planting)** - *an ancient Chinese art of forming\n  trees and landscapes in miniature, also called penjing and an ancestor of the\n  Japanese art of bonsai.*\n\nPenzai is a JAX library for writing models as legible, functional pytree data\nstructures, along with tools for visualizing, modifying, and analyzing them.\nPenzai focuses on **making it easy to do stuff with models after they have been\ntrained**, making it a great choice for research involving reverse-engineering\nor ablating model components, inspecting and probing internal activations,\nperforming model surgery, debugging 
architectures, and more. (But if you just\nwant to build and train a model, you can do that too!)\n\nWith Penzai, your neural networks could look like this:\n\n![Screenshot of the Gemma model in Penzai](docs/_static/readme_teaser.png)\n\nPenzai is structured as a collection of modular tools, designed together but\neach useable independently:\n\n\n* A superpowered interactive Python pretty-printer:\n\n  * [Treescope](https://treescope.readthedocs.io/en/stable/) (`pz.ts`):\n    A drop-in replacement for the ordinary IPython/Colab renderer, originally\n    a part of Penzai but now available as a standalone package. It's designed to\n    help understand Penzai models and other deeply-nested JAX pytrees, with\n    built-in support for visualizing arbitrary-dimensional NDArrays.\n\n* A set of JAX tree and array manipulation utilities:\n\n  * `penzai.core.selectors` (`pz.select`): A pytree swiss-army-knife,\n    generalizing JAX's `.at[...].set(...)` syntax to arbitrary type-driven\n    pytree traversals, and making it easy to do complex rewrites or\n    on-the-fly patching of Penzai models and other data structures.\n\n  * `penzai.core.named_axes` (`pz.nx`): A lightweight named axis system which\n    lifts ordinary JAX functions to vectorize over named axes, and allows you to\n    seamlessly switch between named and positional programming styles without\n    having to learn a new array API.\n\n* A declarative combinator-based neural network library, where models are\n  represented as easy-to-modify data structures:\n\n  * `penzai.nn` (`pz.nn`): An alternative to other neural network libraries like\n    Flax, Haiku, Keras, or Equinox, which exposes the full structure of your model's\n    forward pass using declarative combinators. 
Like Equinox, models are\n    represented as JAX PyTrees, which means you can see everything your model\n    does by pretty printing it, and inject new runtime logic with `jax.tree_util`.\n    However, `penzai.nn` models may also contain mutable variables at the leaves\n    of the tree, allowing them to keep track of mutable state and parameter\n    sharing.\n\n* A modular implementation of common Transformer architectures, to support\n  research into interpretability, model surgery, and training dynamics:\n\n  * `penzai.models.transformer`: A reference Transformer implementation that\n  can load the pre-trained weights for the Gemma, Llama, Mistral, and\n  GPT-NeoX / Pythia architectures. Built using modular components and named\n  axes, to simplify complex model-manipulation workflows.\n\nDocumentation on Penzai can be found at\n[https://penzai.readthedocs.io](https://penzai.readthedocs.io).\n\n\u003e [!IMPORTANT]\n\u003e Penzai 0.2 includes a number of breaking changes to the neural network API.\n\u003e These changes are intended to simplify common workflows\n\u003e by introducing first-class support for mutable state and parameter sharing\n\u003e and removing unnecessary boilerplate. You can read about the differences\n\u003e between the old \"V1\" API and the current \"V2\" API in the\n\u003e [\"Changes in the V2 API\"][v2_differences] overview.\n\u003e\n\u003e If you are currently using the V1 API and have not yet converted to the V2\n\u003e system, you can instead keep the old behavior by importing from the\n\u003e `penzai.deprecated.v1` submodule, e.g.:\n\u003e\n\u003e ```python\n\u003e from penzai.deprecated.v1 import pz\n\u003e from penzai.deprecated.v1.example_models import simple_mlp\n\u003e ```\n\n[v2_differences]: https://penzai.readthedocs.io/en/stable/guides/v2_differences.html\n\n\n## Getting Started\n\nIf you haven't already installed JAX, you should do that first, since the\ninstallation process depends on your platform. 
You can find instructions in the\n[JAX documentation](https://jax.readthedocs.io/en/latest/installation.html).\nAfterward, you can install Penzai using\n\n```bash\npip install penzai\n```\n\nand import it using\n\n```python\nimport penzai\nfrom penzai import pz\n```\n\n(`penzai.pz` is an *alias namespace*, which makes it easier to reference\ncommon Penzai objects.)\n\nWhen working in a Colab or IPython notebook, we recommend also configuring\nTreescope (Penzai's companion pretty-printer) as the default pretty printer, and\nenabling some utilities for interactive use:\n\n```python\nimport treescope\ntreescope.basic_interactive_setup(autovisualize_arrays=True)\n```\n\nHere's how you could initialize and visualize a simple neural network:\n\n```python\nimport jax\nfrom penzai.models import simple_mlp\nmlp = simple_mlp.MLP.from_config(\n    name=\"mlp\",\n    init_base_rng=jax.random.key(0),\n    feature_sizes=[8, 32, 32, 8]\n)\n\n# Models and arrays are visualized automatically when you output them from a\n# Colab/IPython notebook cell:\nmlp\n```\n\nHere's how you could capture and extract the activations after the elementwise\nnonlinearities:\n\n```python\nfrom typing import Any\n\n@pz.pytree_dataclass\nclass AppendIntermediate(pz.nn.Layer):\n  saved: pz.StateVariable[list[Any]]\n  def __call__(self, x: Any, **unused_side_inputs) -\u003e Any:\n    self.saved.value = self.saved.value + [x]\n    return x\n\nvar = pz.StateVariable(value=[], label=\"my_intermediates\")\n\n# Make a copy of the model that saves its activations:\nsaving_model = (\n    pz.select(mlp)\n    .at_instances_of(pz.nn.Elementwise)\n    .insert_after(AppendIntermediate(var))\n)\n\noutput = saving_model(pz.nx.ones({\"features\": 8}))\nintermediates = var.value\n```\n\nTo learn more about how to build and manipulate neural networks with Penzai,\nwe recommend starting with the [\"How to Think in Penzai\" tutorial][how_to_think]\nor one of the other tutorials in the [Penzai documentation][].\n\n[how_to_think]: 
https://penzai.readthedocs.io/en/stable/notebooks/how_to_think_in_penzai.html\n[Penzai documentation]: https://penzai.readthedocs.io\n\n## Citation\n\nIf you have found Penzai to be useful for your research, please consider\nciting the following writeup (also available on [arXiv](https://arxiv.org/abs/2408.00211)):\n\n```\n@article{johnson2024penzai,\n    author={Daniel D. Johnson},\n    title={{Penzai} + {Treescope}: A Toolkit for Interpreting, Visualizing, and Editing Models As Data},\n    year={2024},\n    journal={ICML 2024 Workshop on Mechanistic Interpretability}\n}\n```\n\n---\n\n*This is not an officially supported Google product.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fpenzai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-deepmind%2Fpenzai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fpenzai/lists"}