{"id":18400679,"url":"https://github.com/databricks/sbt-spark-package","last_synced_at":"2025-10-18T22:55:34.731Z","repository":{"id":27429520,"uuid":"30907323","full_name":"databricks/sbt-spark-package","owner":"databricks","description":"Sbt plugin for Spark packages","archived":false,"fork":false,"pushed_at":"2018-01-10T23:30:29.000Z","size":74,"stargazers_count":152,"open_issues_count":20,"forks_count":32,"subscribers_count":18,"default_branch":"master","last_synced_at":"2025-04-03T00:59:00.054Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/databricks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-02-17T07:44:57.000Z","updated_at":"2025-02-28T08:14:12.000Z","dependencies_parsed_at":"2022-07-24T15:02:01.464Z","dependency_job_id":null,"html_url":"https://github.com/databricks/sbt-spark-package","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fsbt-spark-package","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fsbt-spark-package/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fsbt-spark-package/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fsbt-spark-package/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/databricks","download_url":"https://codeload.github.com/databricks/sbt-spark-package/tar.gz/refs/heads/master","host":{"name":"
GitHub","url":"https://github.com","kind":"github","repositories_count":247607769,"owners_count":20965945,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T02:35:59.492Z","updated_at":"2025-10-18T22:55:34.651Z","avatar_url":"https://github.com/databricks.png","language":"Scala","readme":"sbt-spark-package [![Build Status](https://travis-ci.org/databricks/sbt-spark-package.svg)](http://travis-ci.org/databricks/sbt-spark-package)\n==================\n\n*Sbt Plugin for Spark Packages*\n\nsbt-spark-package is an sbt plugin that aims to simplify the use and development of Spark Packages.\n\n**Please upgrade to version 0.2.4+ as spark-packages now supports SSL**.\n\nRequirements\n============\n\n* sbt\n\nSetup\n=====\n\n### The sbt way\n\nSimply add the following to `\u003cyour_project\u003e/project/plugins.sbt`:\n```scala\n  resolvers += \"bintray-spark-packages\" at \"https://dl.bintray.com/spark-packages/maven/\"\n\n  addSbtPlugin(\"org.spark-packages\" % \"sbt-spark-package\" % \"0.2.6\")\n```\n\nUsage\n=====\n\nSpark Package Developers\n------------------------\n\nIn your `build.sbt` file include the appropriate values for:\n\n * `spName := \"organization/my-awesome-spark-package\" // the name of your Spark Package`\n\nPlease specify any Spark dependencies using `sparkVersion` and `sparkComponents`. For example:\n\n * `sparkVersion := \"2.1.0\" // the Spark version your package depends on.`\n\n Spark Core will be included by default if no value for `sparkComponents` is supplied. 
You can add sparkComponents as:\n\n * `sparkComponents += \"mllib\" // creates a dependency on spark-mllib.`\n\n or\n\n * `sparkComponents ++= Seq(\"streaming\", \"sql\")`\n\nYou can make a zip archive ready for a release on the Spark Packages website by simply calling\n`sbt spDist`. This command will include any Python files related to your package in the\n jar inside this archive. When this jar is added to your PYTHONPATH, you will be able to use your\n Python files.\n\nBy default, the zip file will be produced in `\u003cproject\u003e/target`, but you can\noverride this by providing a value for `spDistDirectory` like:\n\n`spDistDirectory := \"Users\" / \"foo\" / \"Documents\" / \"bar\"`\n\nEven on a Windows system, keep the slashes as forward slashes; don't switch them to backslashes.\n\nYou may publish your package locally for testing with `sbt spPublishLocal`.\n\nIn addition, `sbt console` will create a SparkContext for you, so you can test your code as you would in the spark-shell.\n\nIf you want to make a release of your package against multiple Scala versions (e.g. 2.10, 2.11),\nyou may set `spAppendScalaVersion := true` in your build file.\n\nIf you really can't specify Spark dependencies using `sparkComponents` (e.g. you have\nexclusion rules) and configure them as `provided` yourself (e.g. for a standalone demo jar), you may set\n `spIgnoreProvided := true` to work properly with the `assembly` plugin.\n\n### Including shaded dependencies\n\nSometimes you may require shading for your package to work in certain environments. sbt-spark-package\nsupports publishing shaded dependencies built through the sbt-assembly plugin. 
To achieve this,\nyou will need two projects: one for building the shaded dependency, and one for building the\ndistribution-ready package.\n\n```scala\nlazy val shaded = Project(\"shaded\", file(\".\")).settings(\n  libraryDependencies ++= (dependenciesToShade ++\n    nonShadedDependencies.map(_ % \"provided\")), // don't include any other dependency in your assembly jar\n  target := target.value / \"shaded\", // have a separate target directory to make sbt happy\n  assemblyShadeRules in assembly := Seq(\n    ShadeRule.rename(\"blah.**\" -\u003e \"bleh.@1\").inAll\n  )\n) // add all other settings\n\nlazy val distribute = Project(\"distribution\", file(\".\")).settings(\n  spName := ..., // your Spark Package name\n  target := target.value / \"distribution\",\n  spShade := true, // THIS IS THE MOST IMPORTANT SETTING\n  assembly in spPackage := (assembly in shaded).value, // this will pick up the shaded jar for distribution\n  libraryDependencies := nonShadedDependencies // have all your non-shaded dependencies here so that we can\n                                               // generate a clean pom.\n) // add all other settings\n```\n\nNow you may use `distribution/spDist` to build your zip file, or `distribution/spPublish` to publish a\nnew release. For more details on publishing, please refer to the next section.\n\n### Registering and publishing Spark Packages\n\n*credentials*\n\nIn order to use `spRegister` or `spPublish` to register or publish a release of your Spark Package,\nyou have to specify your GitHub credentials. 
You may specify your credentials through a file (recommended)\nor directly in your build file, like below:\n\n```scala\ncredentials += Credentials(Path.userHome / \".ivy2\" / \".sbtcredentials\") // A file containing credentials\n\ncredentials += Credentials(\"Spark Packages Realm\",\n                           \"spark-packages.org\",\n                           s\"$GITHUB_USERNAME\",\n                           s\"$GITHUB_PERSONAL_ACCESS_TOKEN\")\n```\n\nMore can be found in the [sbt documentation](http://www.scala-sbt.org/0.13/docs/Publishing.html#Credentials).\n\nUsing these functions requires \"read:org\" GitHub access to authenticate ownership of the repo. Documentation\non generating a GitHub Personal Access Token can be found\n[here](https://help.github.com/articles/creating-an-access-token-for-command-line-use/).\n\n*spRegister*\n\nYou can register your Spark Package for the first time using this plugin with the command `sbt spRegister`.\nIn order to register your package, you must have logged in to the Spark Packages website at least once,\nand you must supply values for the following settings in your build file:\n\n```scala\nspShortDescription := \"My awesome Spark Package\" // Your one-line description of your package\n\nspDescription := \"\"\"My long description.\n                    |Could be multiple lines long.\n                    | - My package can do this,\n                    | - My package can do that.\"\"\".stripMargin\n\ncredentials += // Your credentials, see above.\n```\n\nThe homepage of your package is by default the web page for the GitHub repository. You can change the default\nhomepage by using:\n\n```scala\nspHomepage := // Set this if you want to specify a web page other than your GitHub repository.\n```\n\n*spPublish*\n\nYou can publish a new release using `sbt spPublish`. The HEAD commit on your local repository will be\nused as the git commit sha for your release. 
Therefore, please make sure that your local commit is\nindeed the version you would like to make a release for, and that you have pushed that commit to the\nmaster branch on your remote.\n\nThe required settings for `spPublish` are:\n\n```scala\n// You must have an Open Source License. Some common licenses can be found in: http://opensource.org/licenses\nlicenses += \"Apache-2.0\" -\u003e url(\"http://opensource.org/licenses/Apache-2.0\")\n\n// If you published your package to Maven Central for this release (must be done prior to spPublish)\nspIncludeMaven := true\n\ncredentials += // Your credentials, see above.\n```\n\n\nSpark Package Users\n-------------------\n\nAny Spark Packages your package depends on can be added as:\n\n * `spDependencies += \"databricks/spark-avro:0.1\" // format is spark-package-name:version`\n\nWe also recommend that you use `sparkVersion` and `sparkComponents` to manage your Spark dependencies.\nIn addition, you can use `sbt assembly` to create an uber jar of your project.\n\nContributions\n=============\n\nIf you encounter bugs or want to contribute, feel free to submit an issue or pull request.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabricks%2Fsbt-spark-package","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatabricks%2Fsbt-spark-package","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabricks%2Fsbt-spark-package/lists"}