{"id":13487292,"url":"https://github.com/vector4wang/quick-spark-process","last_synced_at":"2025-03-21T04:30:24.829Z","repository":{"id":37177669,"uuid":"130245478","full_name":"vector4wang/quick-spark-process","owner":"vector4wang","description":":star2::star2::star2:Examples for learning Spark","archived":false,"fork":false,"pushed_at":"2022-11-15T23:51:47.000Z","size":753,"stargazers_count":38,"open_issues_count":5,"forks_count":34,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-10-12T07:38:11.171Z","etag":null,"topics":["java","spark","springboot-spark"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vector4wang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-04-19T16:46:29.000Z","updated_at":"2024-09-26T15:31:12.000Z","dependencies_parsed_at":"2023-01-20T17:38:58.700Z","dependency_job_id":null,"html_url":"https://github.com/vector4wang/quick-spark-process","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vector4wang%2Fquick-spark-process","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vector4wang%2Fquick-spark-process/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vector4wang%2Fquick-spark-process/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vector4wang%2Fquick-spark-process/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vector4wang","download_url":"https://codeload.github.com/vector4wang/quick-spark-pro
cess/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221811374,"owners_count":16884305,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["java","spark","springboot-spark"],"created_at":"2024-07-31T18:00:57.414Z","updated_at":"2024-10-28T09:18:38.256Z","avatar_url":"https://github.com/vector4wang.png","language":"Java","funding_links":[],"categories":["Java"],"sub_categories":[],"readme":"# quick-spark-process\nExamples for learning Spark.\n\n[![LICENSE](https://img.shields.io/badge/license-Anti%20996-blue.svg)](https://github.com/996icu/996.ICU/blob/master/LICENSE)\n\n\n### word-count\nThe simplest and most classic example.\nI later set up a Spark cluster and used HDFS to store the files; a few points are worth noting.\n#### How to load files\n```java\ncontext.textFile(\"D:\\\\data\\\\spark\\\\blsmy.txt\");  // for testing in IDEA\ncontext.textFile(\"file:///mnt/data/blsmy.txt\"); // for running on the cluster (every node that runs the job must have this file)\ncontext.textFile(\"hdfs://spark-master:9000/wordcount/blsmy.txt\"); // load the file from HDFS\n```\n#### Where log output goes\nOn the web UI page, log output is split into stderr and stdout. stdout shows what the program itself prints: anything written with println(....) appears in the stdout file, while the rest of Spark's runtime logs appear in the stderr file.\nYou can also view the log files directly, for example:\n```bash\n/spark/software/spark/work/app-20180428142302-0003/0/stdout\n/spark/software/spark/work/app-20180428142302-0003/0/stderr\n```\n#### How to launch\n```bash\nbin/spark-submit \\\n    --master spark://spark-master:7077 \\\n    --driver-memory 1g \\\n    --executor-cores 1 \\\n    --class com.spark.WordCount \\\n    simple/word-count-1.0-SNAPSHOT.jar\n```\n\n\n\n\n### spark-pi\nAnother fairly classic example.\n\n### 
spark-sql\nSimple operations using Spark SQL.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvector4wang%2Fquick-spark-process","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvector4wang%2Fquick-spark-process","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvector4wang%2Fquick-spark-process/lists"}