Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/huangyueranbbc/SparkDemo
Complete Spark example code (Java, Scala)
- Host: GitHub
- URL: https://github.com/huangyueranbbc/SparkDemo
- Owner: huangyueranbbc
- License: mit
- Created: 2017-08-03T11:56:32.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-05-09T06:04:12.000Z (over 4 years ago)
- Last Synced: 2024-08-01T18:29:43.275Z (3 months ago)
- Topics: bigdata, hadoop, operator, spark, spark-sql, spark-streaming, sparkfun-products, sparkjava, sparkline, sparkp
- Language: Java
- Homepage:
- Size: 2.33 MB
- Stars: 81
- Watchers: 5
- Forks: 68
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# SparkDEMO [![Travis](https://img.shields.io/badge/SparkDemo-v1.0-yellowgreen.svg)](https://github.com/huangyueranbbc/SparkDemo) [![Travis](https://img.shields.io/badge/Spark-API-green.svg)](http://spark.apache.org/docs/latest/api.html) [![Travis](https://img.shields.io/badge/Apache-Spark-yellowgreen.svg)](http://spark.apache.org/) [![Travis](https://img.shields.io/badge/Spark-ALS-blue.svg)](https://github.com/huangyueranbbc/Spark_ALS)
Spark operation demos
1. Covers all Spark operations
   a. Demos of the official ml, mllib, streaming, and sql operations
   b. Demos of all commonly used operators
2. Converted to a Maven project
3. Detailed comments in Chinese and English
4. Spark API updated to a newer version
5. Added Scala Spark examples

2019-08-06
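## Remote execution mode
1. As needed, upload the files under the /data directory to the same path on HDFS.

        ------data
        ------------mllib
        ------------resources

2. Run `mvn package` to build the jar, then set the path to the jar file (Scala):

        conf.setJars(Array[String]("/Users/huangyueran/ideaworkspaces1/myworkspaces/spark/SparkDemo/target/SparkDemo-1.0-SNAPSHOT-jar-with-dependencies.jar"))

3. Choose the run mode through SparkUtils:

        // Local mode
        JavaSparkContext sc = SparkUtils.getLocalSparkContext(TestStorageLevel.class);
        // Remote mode
        JavaSparkContext sc = SparkUtils.getRemoteSparkContext(TestStorageLevel.class);

4. For remote mode, add the cluster configuration files to the resources directory:

        core-site.xml
        hdfs-site.xml
        yarn-site.xml

5. If you need to load files, choose the loading method that matches the run mode:

        // Local mode: prefix the path with Constant.LOCAL_FILE_PREX
        JavaRDD<String> text = sc.textFile(Constant.LOCAL_FILE_PREX + "/data/resources/test.txt");
        // Remote mode: pass the HDFS path directly
        JavaRDD<String> text = sc.textFile("/data/resources/test.txt");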
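
The path handling in the last step can be sketched without any Spark dependency. The class below is a minimal illustration, not code from SparkDemo: `DataPathResolver`, `RunMode`, and `localPrefix` are hypothetical names, and the `file://` prefix is only an assumption about what `Constant.LOCAL_FILE_PREX` might contain. The idea is that local mode reads from the checked-out project directory, while remote mode passes the path through unchanged so that it resolves against the cluster's default file system.

```java
/** Sketch: choose an input path by run mode. All names are hypothetical, not from SparkDemo. */
class DataPathResolver {

    /** The two run modes described in the steps above. */
    enum RunMode { LOCAL, REMOTE }

    /**
     * Assumed stand-in for Constant.LOCAL_FILE_PREX: a file:// prefix
     * pointing at the project root. The real constant's value may differ.
     */
    static String localPrefix(String projectRoot) {
        // Drop a trailing slash so the prefix joins cleanly with "/data/...".
        String root = projectRoot.endsWith("/")
                ? projectRoot.substring(0, projectRoot.length() - 1)
                : projectRoot;
        return "file://" + root;
    }

    /** Returns the string that would be handed to sc.textFile(...). */
    static String resolve(RunMode mode, String projectRoot, String dataPath) {
        if (!dataPath.startsWith("/")) {
            throw new IllegalArgumentException("dataPath must be absolute: " + dataPath);
        }
        // LOCAL: read from the local checkout.
        // REMOTE: leave the path as-is; a scheme-less path is resolved against
        // the default file system configured for the cluster.
        return mode == RunMode.LOCAL ? localPrefix(projectRoot) + dataPath : dataPath;
    }

    public static void main(String[] args) {
        String root = "/home/user/SparkDemo"; // hypothetical checkout location
        System.out.println(resolve(RunMode.LOCAL, root, "/data/resources/test.txt"));
        System.out.println(resolve(RunMode.REMOTE, root, "/data/resources/test.txt"));
    }
}
```

Keeping the mode decision in one place means the demo classes would not need to change when switching between local and remote runs; only the `RunMode` value would.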