Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/duhanmin/datax-on-yarn
Implements a YARN client; datax-on-yarn lets DataX run on the YARN master.
- Host: GitHub
- URL: https://github.com/duhanmin/datax-on-yarn
- Owner: duhanmin
- Created: 2021-07-03T09:32:45.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-11-14T08:30:08.000Z (about 1 year ago)
- Last Synced: 2023-11-14T09:33:45.359Z (about 1 year ago)
- Topics: datax, yarn
- Language: Java
- Homepage:
- Size: 13.4 MB
- Stars: 15
- Watchers: 3
- Forks: 11
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# datax-on-yarn
datax-on-yarn lets DataX run on the YARN master.
## Submission methods
### shell
* `datax_home_hdfs`: the DataX installation package on HDFS
* `datax_job`: the job configuration JSON
* `master_memory`: the sum of the YARN master's memory and the DataX job's memory

```shell
/usr/bin/yarn jar /mnt/dss/211/datax-on-yarn-1.0.0.jar com.on.yarn.Client \
-jar_path /mnt/dss/211/datax-on-yarn-1.0.0.jar \
-appname datax-job \
-master_memory 1024 \
-p dt=20200324,pt=20200324 \
-queue default \
-proxy_user hanmin.du \
-datax_job /mnt/dss/datax/job/t2.json \
-datax_home_hdfs /tmp/linkis/hadoop/datax.tar.gz
```

### sdk api (scala)
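* The `JobLogger` type overrides the log output interface `com.on.yarn.base.YarnManipulator`
* Run parameters are passed in through `dataxJob`
* Add the following dependency:

```xml
<dependency>
    <groupId>com.on.yarn</groupId>
    <artifactId>datax-client</artifactId>
    <version>1.0.0</version>
</dependency>
```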
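```scala
import org.apache.hadoop.yarn.api.records.ApplicationId
import com.on.yarn.Client
// ArrayUtil and ExceptionUtil are assumed to come from Hutool
import cn.hutool.core.util.ArrayUtil
import cn.hutool.core.exceptions.ExceptionUtil

// job, dataxJob, and appId are defined by the calling code
val jobLogger = new JobLogger(job)
var client: Client = null
try {
  val cmd = dataxJob.toStrinArray
  jobLogger.info("------------------Run arguments: " + ArrayUtil.toString(cmd))
  client = new Client(jobLogger)
  if (!client.init(cmd)) throw new RuntimeException("Argument initialization failed: " + dataxJob)
  val applicationId: ApplicationId = client.run
  appId = applicationId.toString
  jobLogger.info("------------------DataX yarn id: " + applicationId.toString)
  // Blocks until the YARN application finishes; true means success
  val result = client.monitorApplication(applicationId)
  if (result) jobLogger.info("Application completed successfully")
  else throw new RuntimeException("Job failed, see the logs for details, AppID: " + applicationId)
} catch {
  case e: Exception =>
    jobLogger.info(ExceptionUtil.stacktraceToString(e))
} finally {
  // Always release the client, even when submission fails
  if (null != client) client.stop()
}
```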
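
The argument array handed to `client.init` presumably carries the same flags as the shell command above. The following is a minimal Java sketch of how such an array might be assembled. It is not part of the project: the class `DataxArgsSketch`, its methods, and the 512 MB + 512 MB split are hypothetical and chosen only for illustration. The flag names and paths are taken from the shell example, and `-master_memory` is computed as the sum of the YARN master's memory and the DataX job's memory, as described in the shell section.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of datax-on-yarn.
final class DataxArgsSketch {

    // -master_memory must cover the YARN master plus the DataX job.
    static int masterMemoryMb(int yarnMasterMb, int dataxJobMb) {
        return yarnMasterMb + dataxJobMb;
    }

    // Flattens name/value pairs into the "-name value" form used on the command line.
    static String[] toArgs(Map<String, String> options) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : options.entrySet()) {
            args.add("-" + e.getKey());
            args.add(e.getValue());
        }
        return args.toArray(new String[0]);
    }

    // Builds arguments mirroring the shell example; the 512/512 split is an assumption.
    static String[] exampleArgs() {
        Map<String, String> options = new LinkedHashMap<>(); // keeps insertion order
        options.put("jar_path", "/mnt/dss/211/datax-on-yarn-1.0.0.jar");
        options.put("appname", "datax-job");
        options.put("master_memory", String.valueOf(masterMemoryMb(512, 512)));
        options.put("queue", "default");
        options.put("datax_job", "/mnt/dss/datax/job/t2.json");
        options.put("datax_home_hdfs", "/tmp/linkis/hadoop/datax.tar.gz");
        return toArgs(options);
    }

    public static void main(String[] argv) {
        // Print the assembled arguments for inspection
        System.out.println(String.join(" ", exampleArgs()));
    }
}
```

The resulting `String[]` could then be passed wherever the Scala example uses `cmd`, assuming `client.init` accepts arguments in this form.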
## Run example
![image](https://user-images.githubusercontent.com/28647031/181469603-e864c064-2b4c-4e0c-92d2-9cb9435435aa.png)