Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nteract/papermill
π Parameterize, execute, and analyze notebooks
https://github.com/nteract/papermill
julia jupyter notebook notebook-generator notebooks nteract pipeline publishing python r scala
Last synced: 6 days ago
JSON representation
π Parameterize, execute, and analyze notebooks
- Host: GitHub
- URL: https://github.com/nteract/papermill
- Owner: nteract
- License: bsd-3-clause
- Created: 2017-07-06T17:17:53.000Z (over 7 years ago)
- Default Branch: main
- Last Pushed: 2024-10-05T15:29:32.000Z (4 months ago)
- Last Synced: 2024-10-29T10:06:13.423Z (3 months ago)
- Topics: julia, jupyter, notebook, notebook-generator, notebooks, nteract, pipeline, publishing, python, r, scala
- Language: Python
- Homepage: http://papermill.readthedocs.io/en/latest/
- Size: 1.88 MB
- Stars: 5,958
- Watchers: 86
- Forks: 429
- Open Issues: 148
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-jupyter-resources - GitHub - 30% open Β· β±οΈ 15.08.2022): (δΊ€δΊεΌε°ι¨δ»Άεε―θ§ε)
- best-of-jupyter - GitHub - 36% open Β· β±οΈ 05.10.2024): (Interactive Widgets & Visualization)
- jimsghstars - nteract/papermill - π Parameterize, execute, and analyze notebooks (Python)
- awesome-robotic-tooling - papermill - A tool for parameterizing, executing, and analyzing Jupyter Notebooks. (Documentation and Presentation)
- awesome-production-machine-learning - Papermill - Papermill is a library for parameterizing notebooks and executing them like Python scripts. (Data Science Notebook Frameworks)
- Awesome-AIML-Data-Ops - Papermill - Papermill is a library for parameterizing notebooks and executing them like Python scripts. (Data Science Notebook Frameworks)
- awesome-starred - nteract/papermill - π Parameterize, execute, and analyze notebooks (scala)
- awesome-production-machine-learning - Papermill - Papermill is a library for parameterizing notebooks and executing them like Python scripts. (DS Notebook)
- my-awesome-github-stars - nteract/papermill - π Parameterize, execute, and analyze notebooks (Python)
README
[![CI](https://github.com/nteract/papermill/actions/workflows/ci.yml/badge.svg)](https://github.com/nteract/papermill/actions/workflows/ci.yml)
[![CI](https://github.com/nteract/papermill/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/nteract/papermill/actions/workflows/ci.yml)
[![image](https://codecov.io/github/nteract/papermill/coverage.svg?branch=main)](https://codecov.io/github/nteract/papermill?branch=main)
[![Documentation Status](https://readthedocs.org/projects/papermill/badge/?version=latest)](http://papermill.readthedocs.io/en/latest/?badge=latest)
[![badge](https://tinyurl.com/ybwovtw2)](https://mybinder.org/v2/gh/nteract/papermill/main?filepath=binder%2Fprocess_highlight_dates.ipynb)
[![badge](https://tinyurl.com/y7uz2eh9)](https://mybinder.org/v2/gh/nteract/papermill/main?)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/papermill)](https://pypi.org/project/papermill/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
[![papermill](https://snyk.io/advisor/python/papermill/badge.svg)](https://snyk.io/advisor/python/papermill)
[![Anaconda-Server Badge](https://anaconda.org/conda-forge/papermill/badges/downloads.svg)](https://anaconda.org/conda-forge/papermill)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/nteract/papermill/main.svg)](https://results.pre-commit.ci/latest/github/nteract/papermill/main)**papermill** is a tool for parameterizing, executing, and analyzing
Jupyter Notebooks.Papermill lets you:
- **parameterize** notebooks
- **execute** notebooksThis opens up new opportunities for how notebooks can be used. For
example:- Perhaps you have a financial report that you wish to run with
different values on the first or last day of a month or at the
beginning or end of the year, **using parameters** makes this task
easier.
- Do you want to run a notebook and depending on its results, choose a
particular notebook to run next? You can now programmatically
**execute a workflow** without having to copy and paste from
notebook to notebook manually.Papermill takes an *opinionated* approach to notebook parameterization and
execution based on our experiences using notebooks at scale in data
pipelines.## Installation
From the command line:
```{.sourceCode .bash}
pip install papermill
```For all optional io dependencies, you can specify individual bundles
like `s3`, or `azure` -- or use `all`. To use Black to format parameters you can add as an extra requires \['black'\].```{.sourceCode .bash}
pip install papermill[all]
```## Python Version Support
This library currently supports Python 3.8+ versions. As minor Python
versions are officially sunset by the Python org papermill will similarly
drop support in the future.## Usage
### Parameterizing a Notebook
To parameterize your notebook designate a cell with the tag `parameters`.
![enable parameters in Jupyter](docs/img/enable_parameters.gif)
Papermill looks for the `parameters` cell and treats this cell as defaults for the parameters passed in at execution time. Papermill will add a new cell tagged with `injected-parameters` with input parameters in order to overwrite the values in `parameters`. If no cell is tagged with `parameters` the injected cell will be inserted at the top of the notebook.
Additionally, if you rerun notebooks through papermill and it will reuse the `injected-parameters` cell from the prior run. In this case Papermill will replace the old `injected-parameters` cell with the new run's inputs.
![image](docs/img/parameters.png)
### Executing a Notebook
The two ways to execute the notebook with parameters are: (1) through
the Python API and (2) through the command line interface.#### Execute via the Python API
```{.sourceCode .python}
import papermill as pmpm.execute_notebook(
'path/to/input.ipynb',
'path/to/output.ipynb',
parameters = dict(alpha=0.6, ratio=0.1)
)
```#### Execute via CLI
Here's an example of a local notebook being executed and output to an
Amazon S3 account:```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
```**NOTE:**
If you use multiple AWS accounts, and you have [properly configured your AWS credentials](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html), then you can specify which account to use by setting the `AWS_PROFILE` environment variable at the command-line. For example:```{.sourceCode .bash}
$ AWS_PROFILE=dev_account papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
```In the above example, two parameters are set: `alpha` and `l1_ratio` using `-p` (`--parameters` also works). Parameter values that look like booleans or numbers will be interpreted as such. Here are the different ways users may set parameters:
```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -r version 1.0
```Using `-r` or `--parameters_raw`, users can set parameters one by one. However, unlike `-p`, the parameter will remain a string, even if it may be interpreted as a number or boolean.
```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -f parameters.yaml
```Using `-f` or `--parameters_file`, users can provide a YAML file from which parameter values should be read.
```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -y "
alpha: 0.6
l1_ratio: 0.1"
```Using `-y` or `--parameters_yaml`, users can directly provide a YAML string containing parameter values.
```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -b YWxwaGE6IDAuNgpsMV9yYXRpbzogMC4xCg==
```Using `-b` or `--parameters_base64`, users can provide a YAML string, base64-encoded, containing parameter values.
When using YAML to pass arguments, through `-y`, `-b` or `-f`, parameter values can be arrays or dictionaries:
```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -y "
x:
- 0.0
- 1.0
- 2.0
- 3.0
linear_function:
slope: 3.0
intercept: 1.0"
```#### Supported Name Handlers
Papermill supports the following name handlers for input and output paths during execution:
- Local file system: `local`
- HTTP, HTTPS protocol: `http://, https://`
- Amazon Web Services: [AWS S3](https://aws.amazon.com/s3/) `s3://`
- Azure: [Azure DataLake Store](https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview), [Azure Blob Store](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-overview) `adl://, abs://`
- Google Cloud: [Google Cloud Storage](https://cloud.google.com/storage/) `gs://`
## Development Guide
Read [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines on how to setup a local development environment and make code changes back to Papermill.
For development guidelines look in the [DEVELOPMENT_GUIDE.md](./DEVELOPMENT_GUIDE.md) file. This should inform you on how to make particular additions to the code base.
## Documentation
We host the [Papermill documentation](http://papermill.readthedocs.io)
on ReadTheDocs.