https://github.com/dixslyf/nbparts
Unpack a Jupyter notebook into its sources, outputs and metadata.
data haskell jupyter jupyter-notebook nix nix-flake
Last synced: 9 months ago
- Host: GitHub
- URL: https://github.com/dixslyf/nbparts
- Owner: dixslyf
- License: apache-2.0
- Created: 2025-07-27T14:00:53.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2025-09-15T16:37:18.000Z (9 months ago)
- Last Synced: 2025-09-15T18:06:28.877Z (9 months ago)
- Topics: data, haskell, jupyter, jupyter-notebook, nix, nix-flake
- Language: Haskell
- Homepage:
- Size: 4.19 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# `nbparts`
`nbparts` is a tool for splitting a Jupyter notebook into its "parts":
- sources (code and Markdown content),
- outputs, and
- metadata.
These parts can be re-assembled back into an equivalent Jupyter notebook.
The goal is to make it easier to store and diff Jupyter notebooks in text-based version control systems like Git.
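To make the three parts concrete, here is a purely conceptual Python sketch of the split. It is not how `nbparts` is implemented (the tool is written in Haskell), and the `split_parts` helper and the layout of its return values are hypothetical, not the file formats `nbparts` writes. Only the notebook fields (`cells`, `source`, `outputs`, `metadata`) come from the Jupyter notebook format.

```python
# A tiny in-memory notebook with one Markdown cell and one code cell.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {"tags": ["demo"]},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
            "source": ["print(1 + 1)"],
        },
    ],
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "nbformat": 4,
    "nbformat_minor": 4,
}


def split_parts(nb):
    """Hypothetical helper: separate sources, outputs and metadata."""
    cells = nb["cells"]
    # Sources: only the cell type and its text.
    sources = [
        {"cell_type": c["cell_type"], "source": "".join(c["source"])}
        for c in cells
    ]
    # Outputs: one list per cell (Markdown cells have none).
    outputs = [c.get("outputs", []) for c in cells]
    # Metadata: notebook-level plus per-cell metadata.
    metadata = {
        "notebook": nb["metadata"],
        "cells": [c["metadata"] for c in cells],
    }
    return sources, outputs, metadata


sources, outputs, metadata = split_parts(notebook)
```

With the parts separated like this, an edit to a cell's text touches only `sources`, while re-running the notebook changes only `outputs`.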
## Features
- **Unpack**: Split a Jupyter notebook into its sources, outputs and metadata.
- **Pack**: Reconstruct the original Jupyter notebook from unpacked parts.
- **Formats**:
  - Sources can be exported as _YAML_, _JSON_ or _Markdown_.
  - Outputs and metadata can be exported as _YAML_ or _JSON_.
- **Binary outputs and attachments** (e.g. PNG images, Markdown attachments) are extracted as files alongside the parts.
- **Roundtrip safety**: `unpack` followed by `pack` yields a notebook semantically equivalent to the original.
Markdown and code formatting is preserved.
The only known caveat is that,
when re-encoding binary attachments and outputs into base64,
`nbparts` always wraps lines after 76 characters.
Not all Jupyter notebook platforms wrap their base64 strings,
so although the content reconstructed by `nbparts` is the same,
its formatting may differ slightly.
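The base64 caveat can be illustrated with a short Python sketch. It uses Python's standard `base64` module as a stand-in (`encodebytes` wraps at 76 characters, `b64encode` does not); it is not `nbparts`'s code, and the payload is made up.

```python
import base64

# Made-up binary payload standing in for a PNG output or attachment.
payload = bytes(range(256)) * 2

# One long line, as some notebook platforms store base64 data.
unwrapped = base64.b64encode(payload).decode("ascii")

# Wrapped with a newline after every 76 characters.
wrapped = base64.encodebytes(payload).decode("ascii")

# The two strings differ as text...
text_differs = wrapped != unwrapped

# ...but decode to the same bytes, because the decoder skips newlines.
same_content = base64.b64decode(wrapped) == base64.b64decode(unwrapped) == payload
```

A textual diff of the two notebooks would therefore report a change in the base64 field even though the decoded image is byte-for-byte identical.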
## Motivation
Jupyter notebooks are widely used for data exploration and analysis,
but because they are large JSON documents,
storing them in version control systems like Git is painful:
- Attachments, execution outputs and metadata add significant noise to diffs
and overshadow meaningful changes.
- Even after removing metadata and binary outputs from a notebook,
the diffs for small edits to code or Markdown content are still hard to read
because of the surrounding JSON syntax.
- Collaborating on notebooks is hard when every commit contains unrelated noise.
Tools like Jupytext (awesome tool!) help by representing notebook sources as plaintext.
`nbparts` complements this idea by splitting a notebook not only into its sources,
but also into its outputs and metadata, as separate parts.
This gives us more flexibility:
- If you only care about the source code and Markdown,
you can ignore the outputs and metadata.
- If outputs or metadata matter for reproducibility,
you can commit them alongside the sources.
Since attachments and binary outputs are extracted,
you may even use tools like Git LFS for versioning them.
## Installation
### Cabal
`nbparts` can be installed with Cabal.
You may first want to update Cabal's package database:
```
cabal update
```
Then, run:
```
cabal install nbparts
```
### Pre-Built Binaries
Static binaries for x86_64 Linux are available from the [releases page](https://github.com/dixslyf/nbparts/releases).
No binary releases are currently available for macOS or Windows (contributions welcome!).
On those platforms, please build from source; see [Compiling From Source](#compiling-from-source).
## Basic Usage
Unpack a notebook with all parts exported to YAML:
```sh
# This will create a `notebook.ipynb.nbparts` directory.
nbparts unpack notebook.ipynb
```
Pack the parts back into a notebook:
```
nbparts pack notebook.ipynb.nbparts -o notebook-repacked.ipynb
```
Unpack a notebook, with sources exported to Markdown:
```
nbparts unpack notebook.ipynb --sources-format markdown
```
For more options, see:
```
nbparts --help
```
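One way to check a round trip such as the `unpack` and `pack` commands above is to compare the two notebooks as JSON while ignoring whitespace inside base64 strings. The Python sketch below is an illustration under stated assumptions, not part of `nbparts`: the `BASE64_MIME_TYPES` set and the helper functions are hypothetical, and any difference other than base64 line wrapping is reported as a mismatch.

```python
import json

# Assumption: MIME types whose values are stored as base64 text in a notebook.
BASE64_MIME_TYPES = {"image/png", "image/jpeg", "image/gif", "application/pdf"}


def _strip_whitespace(value):
    # Notebook strings may be stored as one string or as a list of lines.
    text = "".join(value) if isinstance(value, list) else str(value)
    return "".join(text.split())


def normalize(node):
    """Return a copy of a notebook JSON tree with base64 whitespace removed."""
    if isinstance(node, dict):
        return {
            key: _strip_whitespace(value)
            if key in BASE64_MIME_TYPES
            else normalize(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [normalize(item) for item in node]
    return node


def notebooks_equivalent(path_a, path_b):
    """Compare two .ipynb files, ignoring line wrapping in base64 payloads."""
    with open(path_a, encoding="utf-8") as file_a:
        notebook_a = json.load(file_a)
    with open(path_b, encoding="utf-8") as file_b:
        notebook_b = json.load(file_b)
    return normalize(notebook_a) == normalize(notebook_b)
```

For example, `notebooks_equivalent("notebook.ipynb", "notebook-repacked.ipynb")` would return `True` when base64 line wrapping is the only difference between the two files.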
## Compiling From Source
### Cabal
`nbparts` uses Cabal for building and packaging.
To build and install `nbparts`, ensure you have Cabal and GHC installed.
Update Cabal's package database:
```
cabal update
```
Now, clone the repository and `cd` into it. Then, run:
```
cabal install
```
### Nix
`nbparts` provides a Nix flake for building x86_64 Linux binaries.
To build:
```
nix build github:dixslyf/nbparts#nbparts
```
To run:
```
nix run github:dixslyf/nbparts#nbparts
```
Static binaries can be built using Nix and are exposed as the `nbparts-static` flake output:
```
nix build github:dixslyf/nbparts#nbparts-static
```
To run the static binary:
```
nix run github:dixslyf/nbparts#nbparts-static
```
## Running Tests
`nbparts` uses Hspec and Hedgehog for testing.
To run `nbparts`'s tests,
clone the repository and `cd` into it.
Then, run:
```
cabal test
```
### Nix
Tests can also be run with Nix:
```
nix run github:dixslyf/nbparts#nbparts:test:test-nbparts
```