{"id":26054817,"url":"https://github.com/continuous-foundation/nbtx","last_synced_at":"2026-03-01T00:32:32.365Z","repository":{"id":37086300,"uuid":"470343780","full_name":"continuous-foundation/nbtx","owner":"continuous-foundation","description":"Transform and manipulate Jupyter (ipynb) notebook data structures","archived":false,"fork":false,"pushed_at":"2025-09-09T22:51:20.000Z","size":1061,"stargazers_count":7,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-09-17T02:47:46.465Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/continuous-foundation.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-03-15T21:56:20.000Z","updated_at":"2025-04-30T23:08:34.000Z","dependencies_parsed_at":"2025-04-11T02:51:02.785Z","dependency_job_id":null,"html_url":"https://github.com/continuous-foundation/nbtx","commit_stats":{"total_commits":71,"total_committers":3,"mean_commits":"23.666666666666668","dds":"0.23943661971830987","last_synced_commit":"f48fcc008d8d929c334c428f0593f00e6d01a905"},"previous_names":["continuous-foundation/nbtx"],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/continuous-foundation/nbtx","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/continuous-foundation%2Fnbtx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/continuous-foundation%2Fnbtx/tags","releases_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/continuous-foundation%2Fnbtx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/continuous-foundation%2Fnbtx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/continuous-foundation","download_url":"https://codeload.github.com/continuous-foundation/nbtx/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/continuous-foundation%2Fnbtx/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29956195,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-28T22:53:01.873Z","status":"ssl_error","status_checked_at":"2026-02-28T22:52:50.699Z","response_time":90,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-08T10:00:07.572Z","updated_at":"2026-03-01T00:32:32.335Z","avatar_url":"https://github.com/continuous-foundation.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# nbtx: Jupyter Notebook Transformation Library\n\n[![nbtx on npm](https://img.shields.io/npm/v/nbtx.svg)](https://www.npmjs.com/package/nbtx)\n[![MIT 
License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/curvenote/nbtx/blob/main/LICENSE)\n[![CI](https://github.com/curvenote/nbtx/workflows/CI/badge.svg)](https://github.com/curvenote/nbtx/actions)\n\nTransform Jupyter notebook JSON files (`*.ipynb`) to and from more compact data structures for use in web applications or other contexts where loading component parts (e.g. images, data, etc.) is preferred. For example, when pulling apart a notebook in a publishing workflow, the images, interactive charts, or other outputs must be available either on disk or through a specific web request.\n\n## Driving Use Cases\n\n1. Optimize a notebook for a viewing context, so that the initial network payload is small (no images, html, data), allowing large components to be loaded lazily.\n2. Identify and extract known output images, html and other data for other formats (e.g. JATS, LaTeX, Word), where the images and outputs must be accessed independently.\n3. Allow for additional, post-processed mimetypes to be added to the transformed notebook (e.g. WebP, thumbnail images) while maintaining a transformation path back to the original notebook.\n\n## Scope\n\nThe scope of this library is currently limited to \"minifying\" large notebook cell outputs, including `stream`, `error`, and mimetype outputs (`update_display_data`, `display_data`, `execute_result`). Large outputs are extracted from the notebook JSON, moved to a cache data structure, and referenced in the notebook by their `hash` and `content_type`. This library also provides a function to restore notebook outputs to their original state, given minified outputs and the cached output content.\n\nThis library uses existing notebook types defined in [nbformat](https://github.com/jupyterlab/jupyterlab/tree/master/packages/nbformat) (see [docs](https://nbformat.readthedocs.io)); the only new types defined in `nbtx` are for \"minified\" outputs. 
However, there are no functions for handling entire notebooks; outputs must be isolated prior to invoking `nbtx` functions. This choice allows the library to be used in non-notebook contexts (e.g. [MyST Markdown](https://myst-tools.org)), which include output mime-bundles but do not conform to the full notebook specification.\n\n## Goals\n\n- Stay as close as possible to the `nbformat` for defining outputs.\n- Identify and transform outputs; `nbtx` does not write files to disk or fetch pieces of a notebook.\n- Identify and extract large stream and error outputs; the length threshold can be customized depending on the use case.\n\n## Installation\n\nInstall using `npm` or `yarn`:\n\n```\nnpm install nbtx\n```\n\n## Usage\n\nThe following example loads a notebook, then iterates through each cell and, if outputs are present, mutates the cells to include minified `output` objects that reference a separate `outputCache`:\n\n```typescript\nimport fs from 'fs';\nimport type { MinifiedContentCache, MinifyOptions } from 'nbtx';\nimport { minifyCellOutput } from 'nbtx';\n\nconst notebook = JSON.parse(fs.readFileSync('my-notebook.ipynb', 'utf-8'));\nconst outputCache: MinifiedContentCache = {};\n// Options for minification, see note on hashing below\nconst opts: Partial\u003cMinifyOptions\u003e = { computeHash };\n\nnotebook.cells.forEach((cell) =\u003e {\n  if (!cell.outputs?.length) return;\n  cell.outputs = minifyCellOutput(cell.outputs, outputCache);\n});\n```\n\nYou may then handle the `outputCache` however you want. 
For example, you can write each large output to its own file and update the cell outputs to point to those files (in this example by adding the `path` field):\n\n```typescript\nimport { extFromMimeType, walkOutputs } from 'nbtx';\n\nnotebook.cells.forEach((cell) =\u003e {\n  if (!cell.outputs?.length) return;\n  walkOutputs(cell.outputs, (output) =\u003e {\n    if (!output.hash || !outputCache[output.hash]) return;\n    const [content, { contentType, encoding }] = outputCache[output.hash];\n    const filename = `${output.hash}${extFromMimeType(contentType)}`;\n    fs.writeFileSync(filename, content, { encoding: encoding as BufferEncoding });\n    // The path can be used, for example in a web-context\n    output.path = filename;\n  });\n});\n```\n\nYou may also rehydrate the original notebook from an `outputCache`:\n\n```typescript\nimport { convertToIOutputs } from 'nbtx';\n\nnotebook.cells.forEach((cell) =\u003e {\n  if (!cell.outputs?.length) return;\n  cell.outputs = convertToIOutputs(cell.outputs, outputCache);\n});\n```\n\n\u003e **Note**\n\u003e Minifying and restoring notebook outputs may change the structure of output text from a string list to a single,\n\u003e new-line-delimited string. Both of these formats are acceptable in the notebook types defined by `nbformat`.\n\n## Hashing function\n\nTo stay dependency-free and run easily in the browser, `nbtx` does not bundle a hashing library.\nTo create the `computeHash` function, choose an algorithm (for example, `md5`) and digest the content. 
If you are in the browser, consider using `crypto-js` or some other random function.\n\n```typescript\nimport { createHash } from 'crypto';\n\nfunction computeHash(content: string): string {\n  return createHash('md5').update(content).digest('hex');\n}\n```\n\nBy default `nbtx` will create a random string for the hash and raise a warning.\n\n## Data transformation example\n\nStarting with an `ipynb` JSON document, the following example shows the output transformation for an `execute_result` with three outputs (html, image, text):\n\n```json\n{\n  ...,\n  \"cells\": [\n    {\n      \"cell_type\": \"code\",\n      ...,\n      \"outputs\": {\n        \"output_type\": \"execute_result\",\n        ...,\n        \"data\": {\n          \"text/html\": [\"...veryLargeString\\n\", \"on many lines\\n\"],\n          \"image/png\": \"base64-encoded-data-without-a-header\",\n          \"text/plain\": [\"alt.VConcatChart(...)\"],\n        }\n      }\n    }\n  ],\n  ...\n}\n```\n\nAfter `minifyCellOutput` is called and an optional pass to write to disk and add a `path` (as in the above example), the JSON structure would be:\n\n```json\n{\n  ...,\n  \"cells\": [\n    {\n      \"cell_type\": \"code\",\n      ...,\n      \"outputs\": {\n        \"output_type\": \"execute_result\",\n        ...,\n        \"data\": {\n          \"text/html\": {\n            \"content_type\": \"text/html\",\n            \"hash\": \"29cb113f927eb3abba1b303571caa653\",\n            // The path isn't added by nbtx, but is a common place to put a URL\n            \"path\": \"/static/29cb113f927eb3abba1b303571caa653.html\"\n          },\n          \"image/png\": {\n            \"content_type\": \"image/png\",\n            \"hash\":  \"W5Zulz9J5PLlOkjN2RWMa6CRgJdjxq2r\",\n            // Known output types are given sensible extensions through `extFromMimeType`\n            \"path\": \"/static/W5Zulz9J5PLlOkjN2RWMa6CRgJdjxq2r.png\"\n          },\n          \"text/plain\": {\n            // Small strings are by 
default not extracted, this can be modified in options\n            \"content\": \"alt.VConcatChart(...)\",\n            \"content_type\": \"text/plain\"\n          }\n        }\n      }\n    }\n  ],\n  ...\n}\n```\n\nViewing and \"rehydration\" applications can choose to `walkOutputs` and download the various parts of a notebook, and/or add additional `mimetypes` to the bundle. For example, an application might add transformations that take screenshots of outputs for long-term preservation, or add web-optimized images (e.g. WebP) that were not created in the execution process.\n\nThis can be done asynchronously, separately from the first request for the notebook content payload, improving page-load speed and leaving it up to the consuming application to decide which of the mime-bundles to fetch.\n\n---\n\nAs of v0.4.0 this package is [ESM only](https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c).\n\n---\n\n\u003cp style=\"text-align: center; color: #aaa; padding-top: 50px\"\u003e\n  Made with love by\n  \u003ca href=\"https://continuous.foundation\" target=\"_blank\" style=\"color: #aaa\"\u003e\n    Continuous Science Foundation \u003cimg src=\"https://continuous.foundation/images/logo-small.svg\" style=\"height: 1em\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcontinuous-foundation%2Fnbtx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcontinuous-foundation%2Fnbtx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcontinuous-foundation%2Fnbtx/lists"}