{"id":14155406,"url":"https://github.com/srstevenson/nb-clean","last_synced_at":"2025-05-15T18:10:43.839Z","repository":{"id":26864513,"uuid":"110604441","full_name":"srstevenson/nb-clean","owner":"srstevenson","description":"Clean Jupyter notebooks for version control. Remove metadata, outputs, and execution counts with Git and pre-commit support.","archived":false,"fork":false,"pushed_at":"2024-10-19T14:38:23.000Z","size":570,"stargazers_count":141,"open_issues_count":0,"forks_count":19,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-10-29T17:21:45.832Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://pypi.org/project/nb-clean","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"isc","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/srstevenson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-11-13T21:28:54.000Z","updated_at":"2024-10-28T07:26:50.000Z","dependencies_parsed_at":"2023-10-02T21:41:18.430Z","dependency_job_id":"2b686718-150a-4c9a-96bc-c9d090eeffa4","html_url":"https://github.com/srstevenson/nb-clean","commit_stats":{"total_commits":376,"total_committers":13,"mean_commits":"28.923076923076923","dds":0.5106382978723405,"last_synced_commit":"0f5e11f27e6108e643598260421cb4e66c58b3c1"},"previous_names":[],"tags_count":26,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srstevenson%2Fnb-clean","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srstevenson%2Fnb-clean/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srstevenson%2Fnb-clean/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srstevenson%2Fnb-clean/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/srstevenson","download_url":"https://codeload.github.com/srstevenson/nb-clean/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247791945,"owners_count":20996870,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["git","jupyter","jupyter-notebook","notebook","version-control"],"created_at":"2024-08-17T08:03:02.883Z","updated_at":"2025-04-08T06:33:00.525Z","avatar_url":"https://github.com/srstevenson.png","language":"Python","funding_links":[],"categories":["others"],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\u003cimg src=\"images/nb-clean.png\" width=300\u003e\u003c/p\u003e\n\n[![License](https://img.shields.io/github/license/srstevenson/nb-clean?label=License\u0026color=blue)](https://github.com/srstevenson/nb-clean/blob/main/LICENSE)\n[![GitHub release](https://img.shields.io/github/v/release/srstevenson/nb-clean?label=GitHub)](https://github.com/srstevenson/nb-clean)\n[![PyPI version](https://img.shields.io/pypi/v/nb-clean?label=PyPI)](https://pypi.org/project/nb-clean/)\n[![Python versions](https://img.shields.io/pypi/pyversions/nb-clean?label=Python)](https://pypi.org/project/nb-clean/)\n[![CI status](https://github.com/srstevenson/nb-clean/workflows/CI/badge.svg)](https://github.com/srstevenson/nb-clean/actions)\n[![Coverage](https://img.shields.io/codecov/c/gh/srstevenson/nb-clean?label=Coverage)](https://app.codecov.io/gh/srstevenson/nb-clean)\n\nnb-clean cleans Jupyter notebooks of cell execution counts, metadata, outputs,\nand (optionally) empty cells, preparing them for committing to version control.\nIt provides both a Git filter and pre-commit hook to automatically clean\nnotebooks before they're staged, and can also be used with other version control\nsystems, as a command line tool, and as a Python library. It can determine if a\nnotebook is clean or not, which can be used as a check in your continuous\nintegration pipelines.\n\n\u003e [!NOTE]\n\u003e\n\u003e nb-clean 2.0.0 introduced a new command line interface to make cleaning\n\u003e notebooks in place easier. If you upgrade from a previous release, you'll need\n\u003e to migrate to the new interface as described under\n\u003e [Migrating to nb-clean 2](#migrating-to-nb-clean-2).\n\n## Installation\n\nnb-clean requires Python 3.9 or later. To run the latest release of nb-clean in\nan ephemeral virtual environment, use [uv]:\n\n```bash\nuvx nb-clean\n```\n\nTo add nb-clean as a dependency to a Python project managed with uv, use:\n\n```bash\nuv add --dev nb-clean\n```\n\n## Usage\n\n### Checking\n\nYou can check if a notebook is clean with:\n\n```bash\nnb-clean check notebook.ipynb\n```\n\nor by passing the notebook contents on standard input:\n\n```bash\nnb-clean check \u003c notebook.ipynb\n```\n\nThe check can be run with the following flags:\n\n- To check for empty cells use `--remove-empty-cells` or the short form `-e`.\n- To ignore cell metadata use `--preserve-cell-metadata` or the short form `-m`.\n  This will ignore all metadata fields. You can also pass a list of fields to\n  ignore with `--preserve-cell-metadata field1 field2` or `-m field1 field2`.\n  Note that when _not_ passing a list of fields, either the `-m` or\n  `--preserve-cell-metadata` flag must be passed _after_ the notebook paths to\n  process, or the notebook paths should be preceded with `--` so they are not\n  interpreted as metadata fields.\n- To ignore cell outputs use `--preserve-cell-outputs` or the short form `-o`.\n- To ignore cell execution counts use `--preserve-execution-counts` or the short\n  form `-c`.\n- To ignore language version notebook metadata use\n  `--preserve-notebook-metadata` or the short form `-n`.\n- To check the notebook does not contain any notebook metadata use\n  `--remove-all-notebook-metadata` or the short form `-M`.\n\nFor example, to check if a notebook is clean whilst ignoring notebook metadata:\n\n```bash\nnb-clean check --preserve-notebook-metadata notebook.ipynb\n```\n\nTo check if a notebook is clean whilst ignoring all cell metadata:\n\n```bash\nnb-clean check --preserve-cell-metadata -- notebook.ipynb\n```\n\nTo check if a notebook is clean whilst ignoring only the `tags` cell metadata\nfield:\n\n```bash\nnb-clean check --preserve-cell-metadata tags -- notebook.ipynb\n```\n\nnb-clean will exit with status code 0 if the notebook is clean, and status code\n1 if it is not. nb-clean will also print details of cell execution counts,\nmetadata, outputs, and empty cells it finds.\n\n### Cleaning (interactive)\n\nYou can clean a Jupyter notebook with:\n\n```bash\nnb-clean clean notebook.ipynb\n```\n\nThis cleans the notebook in place. You can also pass the notebook content on\nstandard input, in which case the cleaned notebook is written to standard\noutput:\n\n```bash\nnb-clean clean \u003c original.ipynb \u003e cleaned.ipynb\n```\n\nThe cleaning can be run with the following flags:\n\n- To remove empty cells use `--remove-empty-cells` or the short form `-e`.\n- To preserve cell metadata use `--preserve-cell-metadata` or the short form\n  `-m`. This will preserve all metadata fields. You can also pass a list of\n  fields to preserve with `--preserve-cell-metadata field1 field2` or\n  `-m field1 field2`. Note that when _not_ passing a list of fields, either the\n  `-m` or `--preserve-cell-metadata` flag must be passed _after_ the notebook\n  paths to process, or the notebook paths should be preceded with `--` so they\n  are not interpreted as metadata fields.\n- To preserve cell outputs use `--preserve-cell-outputs` or the short form `-o`.\n- To preserve cell execution counts use `--preserve-execution-counts` or the\n  short form `-c`.\n- To preserve notebook metadata (such as language version) use\n  `--preserve-notebook-metadata` or the short form `-n`.\n- To remove all notebook metadata use `--remove-all-notebook-metadata` or the\n  short form `-M`.\n\nFor example, to clean a notebook whilst preserving notebook metadata:\n\n```bash\nnb-clean clean --preserve-notebook-metadata notebook.ipynb\n```\n\nTo clean a notebook whilst preserving all cell metadata:\n\n```bash\nnb-clean clean --preserve-cell-metadata -- notebook.ipynb\n```\n\nTo clean a notebook whilst preserving only the `tags` cell metadata field:\n\n```bash\nnb-clean clean --preserve-cell-metadata tags -- notebook.ipynb\n```\n\n### Cleaning (Git filter)\n\nTo add a filter to an existing Git repository to automatically clean notebooks\nwhen they're staged, run the following from the working tree:\n\n```bash\nnb-clean add-filter\n```\n\nThis will configure a filter to remove cell execution counts, metadata, and\noutputs. To also remove empty cells, use:\n\n```bash\nnb-clean add-filter --remove-empty-cells\n```\n\nTo preserve cell metadata, such as that required by tools such as [papermill],\nuse:\n\n```bash\nnb-clean add-filter --preserve-cell-metadata\n```\n\nTo preserve only specific cell metadata, e.g., `tags` and `special`, use:\n\n```bash\nnb-clean add-filter --preserve-cell-metadata tags special\n```\n\nTo preserve cell outputs, use:\n\n```bash\nnb-clean add-filter --preserve-cell-outputs\n```\n\nTo preserve cell execution counts, use:\n\n```bash\nnb-clean add-filter --preserve-execution-counts\n```\n\nTo preserve notebook `language_info.version` metadata, use:\n\n```bash\nnb-clean add-filter --preserve-notebook-metadata\n```\n\nBy default, nb-clean will not delete all notebook metadata. To completely remove\nall notebook metadata:\n\n```bash\nnb-clean add-filter --remove-all-notebook-metadata\n```\n\nnb-clean will configure a filter in the Git repository in which it is run, and\nwon't mutate your global or system Git configuration. To remove the filter, run:\n\n```bash\nnb-clean remove-filter\n```\n\n### Cleaning (pre-commit hook)\n\nnb-clean can also be used as a [pre-commit] hook. You may prefer this to the Git\nfilter if your project already uses the pre-commit framework.\n\nNote that the Git filter and pre-commit hook work differently, with different\neffects on your working directory. The pre-commit hook operates on the notebook\non disk, cleaning the copy in your working directory. The Git filter cleans\nnotebooks as they are added to the index, leaving the copy in your working\ndirectory dirty. This means cell outputs are still visible to you in your local\nJupyter instance when using the Git filter, but not when using the pre-commit\nhook.\n\nAfter installing [pre-commit], add the nb-clean hook by adding the following\nsnippet to `.pre-commit-config.yaml` in the root of your repository:\n\n```yaml\nrepos:\n  - repo: https://github.com/srstevenson/nb-clean\n    rev: 4.0.1\n    hooks:\n      - id: nb-clean\n```\n\nYou can pass additional arguments to nb-clean with an `args` array. The\nfollowing example shows how to preserve only two specific metadata fields. Note\nthat, in the example, the final item `--` in the arg list is mandatory. The\noption `--preserve-cell-metadata` may take an arbitrary number of field\narguments, and the `--` argument is needed to separate them from notebook\nfilenames, which `pre-commit` will append to the list of arguments.\n\n```yaml\nrepos:\n  - repo: https://github.com/srstevenson/nb-clean\n    rev: 4.0.1\n    hooks:\n      - id: nb-clean\n        args:\n          - --remove-empty-cells\n          - --preserve-cell-metadata\n          - tags\n          - slideshow\n          - --\n```\n\nRun `pre-commit install` to ensure the hook is installed, and\n`pre-commit autoupdate` to update the hook to the latest release of nb-clean.\n\n### Preserving all nbformat metadata\n\nTo ignore or preserve specifically the metadata defined in the\n[`nbformat` documentation](https://nbformat.readthedocs.io/en/latest/format_description.html#cell-metadata),\nuse the following options:\n`--preserve-cell-metadata collapsed scrolled deletable editable format name tags jupyter execution`.\n\n### Migrating to nb-clean 2\n\nThe following table maps from the command line interface of nb-clean 1.6.0 to\nthat of nb-clean \u003e=2.0.0.\n\nThe examples in the table use long flags, but short flags can also be used\ninstead.\n\n| Description                                 | nb-clean 1.6.0                                                   | nb-clean \u003e=2.0.0                                            |\n| ------------------------------------------- | ---------------------------------------------------------------- | ----------------------------------------------------------- |\n| Clean notebook                              | `nb-clean clean --input notebook.ipynb \\| sponge notebook.ipynb` | `nb-clean clean notebook.ipynb`                             |\n| Clean notebook (remove empty cells)         | `nb-clean clean --input notebook.ipynb --remove-empty`           | `nb-clean clean --remove-empty-cells notebook.ipynb`        |\n| Clean notebook (preserve all cell metadata) | `nb-clean clean --input notebook.ipynb --preserve-metadata`      | `nb-clean clean --preserve-cell-metadata -- notebook.ipynb` |\n| Check notebook                              | `nb-clean check --input notebook.ipynb`                          | `nb-clean check notebook.ipynb`                             |\n| Check notebook (ignore non-empty cells)     | `nb-clean check --input notebook.ipynb --remove-empty`           | `nb-clean check --remove-empty-cells notebook.ipynb`        |\n| Check notebook (ignore all cell metadata)   | `nb-clean check --input notebook.ipynb --preserve-metadata`      | `nb-clean check --preserve-cell-metadata -- notebook.ipynb` |\n| Add Git filter to clean notebooks           | `nb-clean configure-git`                                         | `nb-clean add-filter`                                       |\n| Remove Git filter                           | `nb-clean unconfigure-git`                                       | `nb-clean remove-filter`                                    |\n\n## Copyright\n\nCopyright © Scott Stevenson.\n\nnb-clean is distributed under the terms of the [ISC license].\n\n[isc license]: https://opensource.org/licenses/ISC\n[papermill]: https://papermill.readthedocs.io/\n[pre-commit]: https://pre-commit.com/\n[uv]: https://docs.astral.sh/uv/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrstevenson%2Fnb-clean","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsrstevenson%2Fnb-clean","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrstevenson%2Fnb-clean/lists"}