{"id":19643751,"url":"https://github.com/logicalclocks/spark-chef","last_synced_at":"2025-04-28T13:31:07.475Z","repository":{"id":24916619,"uuid":"28333516","full_name":"logicalclocks/spark-chef","owner":"logicalclocks","description":"Apache Spark chef cookbook","archived":false,"fork":false,"pushed_at":"2024-11-05T22:24:32.000Z","size":517,"stargazers_count":3,"open_issues_count":0,"forks_count":32,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-05T09:04:47.859Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/logicalclocks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-12-22T09:59:33.000Z","updated_at":"2024-06-14T08:56:04.000Z","dependencies_parsed_at":"2023-02-10T22:01:43.023Z","dependency_job_id":"e8fdcd49-4663-414d-9c5b-f04a30245f02","html_url":"https://github.com/logicalclocks/spark-chef","commit_stats":null,"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/logicalclocks%2Fspark-chef","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/logicalclocks%2Fspark-chef/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/logicalclocks%2Fspark-chef/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/logicalclocks%2Fspark-chef/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/logicalclocks",
"download_url":"https://codeload.github.com/logicalclocks/spark-chef/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251319748,"owners_count":21570450,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T14:23:43.348Z","updated_at":"2025-04-28T13:31:02.384Z","avatar_url":"https://github.com/logicalclocks.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Apache Spark Chef cookbook\n\n### Install Spark standalone\n\n### Install Spark on YARN\n\n\n## References\n\n * https://documentation.altiscale.com/spark-2-0-with-altiscale\n * https://www.linkedin.com/pulse/running-spark-2xx-cloudera-hadoop-distro-cdh-deenar-toraskar-cfa\n\nSet `spark.yarn.jars` by uploading the Spark jars to HDFS and pointing the property at them:\n\n    $ cd $SPARK_HOME\n    $ hadoop fs -mkdir spark-2.0.0-bin-hadoop\n    $ hadoop fs -copyFromLocal jars/* spark-2.0.0-bin-hadoop\n    $ echo \"spark.yarn.jars=hdfs://nameservice1/user/\u003cyourusername\u003e/spark-2.0.0-bin-hadoop/*\" \u003e\u003e conf/spark-defaults.conf\n\nIf you have access to the local directories of all the nodes in your cluster, you can instead copy the archive or the Spark jars to a local directory on each data node using rsync or scp. In that case, change the URL scheme from `hdfs:` to `local:`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flogicalclocks%2Fspark-chef","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flogicalclocks%2Fspark-chef","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flogicalclocks%2Fspark-chef/lists"}