{"id":20325414,"url":"https://github.com/getyourguide/db-rocket","last_synced_at":"2025-04-11T19:42:26.947Z","repository":{"id":37254822,"uuid":"344141713","full_name":"getyourguide/db-rocket","owner":"getyourguide","description":"Keep your local python scripts installed and in sync with a databricks notebook. Shortens the feedback loop to develop projects using a hybrid environment.","archived":false,"fork":false,"pushed_at":"2025-04-08T09:26:50.000Z","size":762,"stargazers_count":16,"open_issues_count":8,"forks_count":2,"subscribers_count":55,"default_branch":"main","last_synced_at":"2025-04-08T10:29:47.631Z","etag":null,"topics":["data-science","databricks","productivity","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/getyourguide.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-03T13:47:17.000Z","updated_at":"2025-04-08T09:26:50.000Z","dependencies_parsed_at":"2023-02-01T04:31:17.476Z","dependency_job_id":"7f6cddbe-b749-49f3-ab9b-ebc01f185656","html_url":"https://github.com/getyourguide/db-rocket","commit_stats":null,"previous_names":["getyourguide/databricks-rocket"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/getyourguide%2Fdb-rocket","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/getyourguide%2Fdb-rocket/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/getyourguide%2Fdb-rocket
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/getyourguide%2Fdb-rocket/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/getyourguide","download_url":"https://codeload.github.com/getyourguide/db-rocket/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248468533,"owners_count":21108836,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","databricks","productivity","python"],"created_at":"2024-11-14T19:39:45.312Z","updated_at":"2025-04-11T19:42:26.910Z","avatar_url":"https://github.com/getyourguide.png","language":"Python","readme":"## Databricks-Rocket\n\n\u003cimg src=\"https://user-images.githubusercontent.com/2252355/173396060-8ebb3a33-f389-421d-bea4-afc01a078307.svg\" width=\"100\" height=\"100\"\u003e\n\n[![PyPI version](https://badge.fury.io/py/databricks-rocket.svg)](https://badge.fury.io/py/databricks-rocket)\n![PyPI downloads](https://img.shields.io/pypi/dm/databricks-rocket)\n\nDatabricks-Rocket (db-rocket for short) keeps your local Python scripts installed and synchronized with a Databricks notebook. Every change on your local machine\nis automatically reflected in the notebook. 
This shortens the feedback loop for developing git-based projects and\neliminates the need to set up a local development environment.\n\n## Installation\n\nInstall `databricks-rocket` using pip:\n\n```sh\npip install databricks-rocket\n```\n\n## Setup\n\nEnsure you've created a personal access token in\nDatabricks ([official documentation](https://docs.databricks.com/dev-tools/cli/index.html)). Afterward, set up the\nDatabricks CLI by executing:\n\n```sh\ndatabricks configure --token\n```\n\nAlternatively, you can set the Databricks token and host in your environment variables:\n\n```sh\nexport DATABRICKS_HOST=\"mydatabrickshost\"\nexport DATABRICKS_TOKEN=\"mydatabrickstoken\"\n```\n\nIf your project isn't already a pip package, you'll need to convert it into one. Use dbrocket for this:\n\n```sh\nrocket setup\n```\n\nThis creates a `setup.py` for you.\n\n## Usage\n\n### To Sync Your Project\n\nBy default, `databricks-rocket` syncs your project to DBFS automatically. This allows you to update your code and have\nthose changes reflected in your Databricks notebook without restarting the Python kernel. Simply execute:\n\n```sh\nrocket launch\n```\n\nYou'll then receive the exact commands to run in your notebook. Example:\n\n```sh\nstevenmi@MacBook db-rocket % rocket launch\n\u003e\u003e Watch activated. Uploaded your project to databricks. 
Install your project in your databricks notebook by running:\n\u003e\u003e %pip install --upgrade pip\n\u003e\u003e %pip install  -r /dbfs/temp/stevenmi/db-rocket/requirements.txt\n\u003e\u003e %pip install --no-deps -e /dbfs/temp/stevenmi/db-rocket\n\nand following in a new Python cell:\n\u003e\u003e %load_ext autoreload\n\u003e\u003e %autoreload 2\n```\n\nFinally, add the content to your Databricks notebook:\n![imgs/img_2.png](imgs/img_2.png)\n\n#### Include non-Python files\nUpload all root-level JSON files:\n```shell\nrocket launch --glob_path=\"*.json\"\n```\nTo additionally upload all env files:\n```shell\nrocket launch --glob_path=\"[\\\"*.json\\\", \\\".env*\\\"]\"\n```\nWhen specifying lists, be mindful of the formatting of the parameter string.\n\n### To Upload Your Python Package\n\nIf you've disabled the watch feature, `databricks-rocket` will only upload your project as a wheel to DBFS:\n\n```sh\nrocket launch --watch=False\n```\n\nExample:\n\n```sh\nstevenmi@MacBook db-rocket % rocket launch --watch=False\n\u003e\u003e Watch is disabled. Building creating a python wheel from your project\n\u003e\u003e Found setup.py. Building python library\n\u003e\u003e Uploaded ./dist/databricks_rocket-2.0.0-py3-none-any.whl to dbfs:/temp/stevenmi/db-rocket/dist/databricks_rocket-2.0.0-py3-none-any.whl\n\u003e\u003e Uploaded wheel to databricks. 
Install your library in your databricks notebook by running:\n\u003e\u003e %pip install --upgrade pip\n\u003e\u003e %pip install  /dbfs/temp/stevenmi/db-rocket/databricks_rocket-2.0.0-py3-none-any.whl --force-reinstall\n```\n\n## Blogposts\n\n- [DBrocket 2.0](https://www.getyourguide.careers/posts/improving-data-science-productivity-with-db-rocket-2-0): A summary of the big improvements we made to the tool in the new release.\n- [DB Rocket 1.0](https://www.getyourguide.careers/posts/open-sourcing-db-rocket-for-data-scientists): Gives more details about the rationale behind dbrocket.\n\n## Support\n\n- Databricks: \u003e=7\n- Python: \u003e=3.7\n- Tested on platforms: Linux, macOS. Windows will probably not work, but contributions are welcome!\n- Supports uploading to Unity Catalog Volumes starting from version 3.0.0. Note that the underlying dependency, `databricks-sdk`, is still in beta. We do not recommend using UC Volumes in production.\n\n## Acknowledgments\n\n- Thanks Leon Poli for the Logo :)\n- Thanks Stephane Leonard for source-code and documentation improvements :)\n- Thanks Malachi Soord for the CI/CD setup and README improvements\n\nContributions are welcome!\n\n\n# Security\n\nFor security issues please contact [security@getyourguide.com](mailto:security@getyourguide.com).\n\n# Legal\n\ndb-rocket is licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE.txt) for the full text.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgetyourguide%2Fdb-rocket","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgetyourguide%2Fdb-rocket","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgetyourguide%2Fdb-rocket/lists"}