{"id":16388654,"url":"https://github.com/phact/dse-spark-rest-api","last_synced_at":"2025-11-13T14:01:29.146Z","repository":{"id":71205786,"uuid":"112261114","full_name":"phact/dse-spark-rest-api","owner":"phact","description":null,"archived":false,"fork":false,"pushed_at":"2018-10-18T18:15:22.000Z","size":220,"stargazers_count":0,"open_issues_count":6,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-22T15:58:27.684Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/phact.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-11-27T23:31:03.000Z","updated_at":"2017-11-27T23:32:12.000Z","dependencies_parsed_at":"2023-06-27T05:18:34.733Z","dependency_job_id":null,"html_url":"https://github.com/phact/dse-spark-rest-api","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/phact/dse-spark-rest-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phact%2Fdse-spark-rest-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phact%2Fdse-spark-rest-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phact%2Fdse-spark-rest-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phact%2Fdse-spark-rest-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/phact","download_url":"https://codeload.github.com/phact/dse-spark-res
t-api/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phact%2Fdse-spark-rest-api/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":284227494,"owners_count":26968592,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-13T02:00:06.582Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T04:29:36.736Z","updated_at":"2025-11-13T14:01:29.130Z","avatar_url":"https://github.com/phact.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spark Submit Options\n\nThis guide describes how to submit Spark jobs through the available API options, and the pros and cons of each.\n\n### Motivation\n\nDSE Analytics jobs (both bulk and streaming) must be submitted to the cluster in order to be executed.\n\nDepending on user requirements, there are a few different ways of doing so. This asset describes the pros and cons of each.\n\n### What is included?\n\nThis field asset includes a simple batch analytics application and describes how to submit it to the cluster using:\n\n* `dse spark-submit`\n* the undocumented Spark RESTful API\n\n
Other alternatives for execution include:\n* `dse spark-client`\n* Spark Job Server\n\n### Business Takeaways\n\nIn analytics use cases, business stakeholders depend on timely and trackable runs of their business logic to satisfy analytical requirements. This asset helps their technical counterparts support these needs.\n\n### Technical Takeaways\n\nThe preferred method for submitting Spark applications (whether remotely using `dse client-tool` or from the cluster itself) is `dse spark-submit`. DSE automatically sets the required environment variables and identifies the Spark master for application submission, which simplifies the availability requirements of Spark applications.\n\nHowever, some users need to submit Spark jobs remotely via REST. In these cases, customers often turn to Job Server and take on the complexity that comes with it just to achieve REST submission. In some cases the undocumented Spark REST API is sufficient to meet the requirement. Be aware, however, that the Spark REST API is not supported by DataStax or by the Spark community.\n\nNote: Starting with DSE 5.1, DSE can automatically find the Master for job submissions and allows the selection of a local datacenter for a Spark job through the `dse://` syntax in the `.master` property. The implementation of `dse://` has broken compatibility with the Spark RESTful API in 5.1.0.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphact%2Fdse-spark-rest-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fphact%2Fdse-spark-rest-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphact%2Fdse-spark-rest-api/lists"}