https://github.com/databricks/notebook-best-practices
An example showing how to apply software engineering best practices to Databricks notebooks.
- Host: GitHub
- URL: https://github.com/databricks/notebook-best-practices
- Owner: databricks
- License: apache-2.0
- Created: 2022-05-16T17:04:42.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2024-07-24T18:09:51.000Z (over 1 year ago)
- Last Synced: 2026-01-25T20:45:09.213Z (2 months ago)
- Language: Python
- Homepage:
- Size: 737 KB
- Stars: 149
- Watchers: 13
- Forks: 73
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Software engineering best practices for Databricks notebooks
This repository is a companion for the example article "Software engineering best practices for Databricks notebooks" ([AWS](https://docs.databricks.com/notebooks/best-practices.html) | [Azure](https://docs.microsoft.com/azure/databricks/notebooks/best-practices) | [GCP](https://docs.gcp.databricks.com/notebooks/best-practices.html)).
Going through the example, you will:
* Add notebooks to Databricks Repos for version control.
* Extract portions of code from one of the notebooks into a shareable component.
* Test the shared code.
* Automatically run notebooks in Git on a schedule using a Databricks job.
* Optionally, apply CI/CD to the notebooks and the shared code.
The example is hands-on. We recommend working through it step by step to learn how to apply these techniques to your own Databricks notebooks.
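As a rough illustration of the "extract" and "test" steps listed above, the sketch below moves a small piece of notebook logic into a plain Python function and checks it with an assertion-based test. The function name `filter_by_country`, the row layout, and the sample values are hypothetical and chosen only for illustration; they are not taken from this repository's code.

```python
from typing import Dict, Iterable, List


def filter_by_country(rows: Iterable[Dict], country: str) -> List[Dict]:
    """Return only the rows whose 'country' field equals `country`.

    Hypothetical helper: logic like this often starts life inline in a
    notebook cell and is then moved into a module that several notebooks
    can import.
    """
    return [row for row in rows if row.get("country") == country]


def test_filter_by_country() -> None:
    """A pytest-style test for the shared function (made-up sample data)."""
    rows = [
        {"country": "USA", "cases": 10},
        {"country": "DZA", "cases": 3},
        {"country": "USA", "cases": 7},
    ]
    result = filter_by_country(rows, "USA")
    # Two of the three sample rows belong to "USA".
    assert len(result) == 2
    assert all(row["country"] == "USA" for row in result)
    # A country with no rows yields an empty list rather than an error.
    assert filter_by_country(rows, "FRA") == []


if __name__ == "__main__":
    test_filter_by_country()
    print("all tests passed")
```

Keeping the function free of notebook-specific state is what makes it testable outside a notebook; a notebook would then import it from the shared module instead of redefining it in a cell.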