{"id":13818033,"url":"https://github.com/aaronshan/presto-third-functions","last_synced_at":"2026-02-02T13:41:06.050Z","repository":{"id":83625628,"uuid":"62708764","full_name":"aaronshan/presto-third-functions","owner":"aaronshan","description":"Some useful presto custom udf functions","archived":false,"fork":false,"pushed_at":"2017-12-13T09:18:29.000Z","size":154,"stargazers_count":54,"open_issues_count":1,"forks_count":41,"subscribers_count":10,"default_branch":"master","last_synced_at":"2024-11-19T16:43:24.756Z","etag":null,"topics":["preto","udf"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aaronshan.png","metadata":{"files":{"readme":"README-zh.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2016-07-06T09:28:09.000Z","updated_at":"2023-04-08T05:38:50.000Z","dependencies_parsed_at":null,"dependency_job_id":"ecfcbcc1-1f34-4fae-9321-cc449e58207c","html_url":"https://github.com/aaronshan/presto-third-functions","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aaronshan%2Fpresto-third-functions","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aaronshan%2Fpresto-third-functions/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aaronshan%2Fpresto-third-functions/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aaronshan%2Fpresto-third-functions/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aaronshan","download_url":"https
://codeload.github.com/aaronshan/presto-third-functions/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254433787,"owners_count":22070527,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["preto","udf"],"created_at":"2024-08-04T07:00:28.660Z","updated_at":"2026-02-02T13:41:06.009Z","avatar_url":"https://github.com/aaronshan.png","language":"Java","funding_links":[],"categories":["Java"],"sub_categories":[],"readme":"# presto-third-functions \n\n[![Build Status](https://travis-ci.org/aaronshan/presto-third-functions.svg?branch=master)](https://travis-ci.org/aaronshan/presto-third-functions)\n[![Documentation Status](https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat)](https://github.com/aaronshan/presto-third-functions/tree/master/README.md)\n[![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](https://github.com/aaronshan/presto-third-functions/tree/master/README-zh.md)\n[![Release](https://img.shields.io/github/release/aaronshan/presto-third-functions.svg)](https://github.com/aaronshan/presto-third-functions/releases)\n\n\n## Introduction\n\nThis project provides a collection of custom Presto functions (UDFs).\n\n## Build\n### Required software versions\n* Java 8 Update 60 or later\n* Maven 3.3.9+\n\n### Commands\n```\ncd ${project_home}\nmvn clean package\n```\n\nTo skip the unit tests, run:\n```\nmvn clean package -DskipTests\n```\nAfter the command finishes, the file `presto-third-functions-{version}-shaded.jar` is generated in the `target` directory.\n\nAlternatively, download it directly from the [releases page](https://github.com/aaronshan/presto-third-functions/releases).\n\n### Version compatibility\n| Version | Description |\n|:--|:--|\n| `0.2.0` | Supports `presto-0.147` to `presto-0.149`|\n| `0.3.0` 
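| Supports `presto-0.150` to `presto-0.151`|\n| `0.4.0` | Supports `presto-0.152`|\n| `0.5.0` | Supports `presto-0.153` to `presto-0.166`|\n| `0.5.1` | Supports `presto-0.167` to `presto-0.168`|\n\n## Functions\n### 1. String functions\n| Function | Description |\n|:--|:--|\n|pinyin(string) -\u003e string | Converts Chinese characters to pinyin|\n|md5(string) -\u003e string |Computes the MD5 hash of a string|\n|sha256(string) -\u003e string |Computes the SHA-256 hash of a string|\n\n### 2. Date functions\n| Function | Description |\n|:--|:--|\n|dayofweek(date_string \\| date) -\u003e int |Returns the day of the week for the given date: Monday is 1, Sunday is 7. Returns -1 on error.|\n|zodiac(date_string \\| date) -\u003e string | Returns the zodiac sign for the date, in English |\n|zodiac_cn(date_string \\| date) -\u003e string | Returns the zodiac sign for the date, in Chinese | \n|typeofdate(date_string \\| date) -\u003e string | Returns the type of the date (1: public holiday, 2: regular weekend, 3: regular working day, 4: make-up working day). Returns -1 on error. | \n\n### 3. Array functions\n| Function | Description |\n|:--|:--|\n|array_union(array, array) -\u003e array |Returns the union of two arrays|\n|value_count(array(T), T value) -\u003e int | Counts the elements in the array that equal the given value|\n\n\u003e I opened a [PR](https://github.com/prestodb/presto/pull/5644#event-729329053) for `array_union`, and it has been merged into the Presto master branch. So if your Presto version is \u003e 0.151, it already includes the `array_union` function.\n\nStarting with version `0.3.0`, the function was renamed to `arr_union`, both to stay compatible with `presto-0.150` and to avoid a name conflict with `presto-0.151+`. (Starting with `0.5.0`, I removed the `arr_union` function; please use `array_union` instead.)\n\n### 4. JSON functions\n| Function | Description |\n|:--|:--|\n|json_array_extract(json, jsonPath) -\u003e array(varchar) |Extracts the values at the given path from a JSON array|\n|json_array_extract_scalar(json, jsonPath) -\u003e array(varchar) |Like `json_array_extract`, but returns the results as strings (not JSON-encoded)|\n\n### 5. Map functions\n| Function | Description |\n|:--|:--|\n|value_count(MAP(K,V), V value) -\u003e int | Counts the entries in the map whose value equals the given value|\n\n### 6. ID card functions\n| Function | Description |\n|:--|:--|\n|id_card_province(string) -\u003e string |Returns the province for an ID card number|\n|id_card_city(string) -\u003e string |Returns the city for an ID card number|\n|id_card_area(string) -\u003e string |Returns the district or county for an ID card number|\n|id_card_birthday(string) -\u003e string |Returns the date of birth for an ID card number|\n|id_card_gender(string) -\u003e string |Returns the gender for an ID card number|\n|is_valid_id_card(string) -\u003e boolean |Checks whether the ID card number is valid|\n|id_card_info(string) -\u003e json |Returns the information for an ID card number, including province, city, district or county, gender, and validity|\n\n### 7. 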
Coordinate functions\n| Function | Description |\n|:--|:--|\n|wgs_distance(double lat1, double lng1, double lat2, double lng2) -\u003e double |Computes the distance between two points in the WGS84 coordinate system, in meters|\n|gcj_to_bd(double,double) -\u003e json |Converts Mars coordinates (GCJ-02) to Baidu coordinates (BD-09), that is, from Google or AMap (Gaode) to Baidu|\n|bd_to_gcj(double,double) -\u003e json |Converts Baidu coordinates (BD-09) to Mars coordinates (GCJ-02), that is, from Baidu to Google or AMap (Gaode)|\n|wgs_to_gcj(double,double) -\u003e json |Converts WGS84 to GCJ-02 (Mars coordinates)|\n|gcj_to_wgs(double,double) -\u003e json |Converts GCJ-02 (Mars coordinates) to WGS84. The resulting WGS84 coordinates are accurate to within 1 to 2 meters.|\n|gcj_extract_wgs(double,double) -\u003e json |Converts GCJ-02 (Mars coordinates) to WGS84. The resulting WGS84 coordinates are accurate to within 0.5 meters, but the computation is slower than `gcj_to_wgs`|\n\n\u003e For background on the coordinate systems used by internet maps, see [The current state of coordinate systems in internet maps](https://github.com/aaronshan/presto-third-functions/tree/master/src/main/java/cc/shanruifeng/functions/udfs/scalar/geographic/README-geo.md)\n\n### 8. Other functions\n| Function | Description |\n|:--|:--|\n|is_null(all_type) -\u003e boolean |Checks whether the value is null|\n\n## Usage\n\nPut `presto-third-functions-{version}-shaded.jar` in the `${presto_home}/plugin/hive-hadoop2` directory and restart Presto. For example:\n### 1. Restart Presto\n```\nmv presto-third-functions-{version}-shaded.jar /home/presto/presto-server-0.147/plugin/hive-hadoop2/\ncd /home/presto/presto-server-0.147\nbin/launcher restart\n```\n\n### 2. Set up the presto command\n```\ncd /home/presto/\nln -s presto-client-0.147/presto-cli-0.147-executable.jar presto-cli\nexport JAVA_HOME=/home/java8/jdk1.8.0_91/;\nexport PATH=/home/java8/jdk1.8.0_91/bin/:$PATH;\nalias presto=\"/home/presto/presto-cli --server localhost:8080 --catalog hive --schema default\"\n```\n\n### 3. 
Examples\n#### 3.1 String functions\n```\npresto:default\u003e select pinyin(country) from (values '中国') as t(country);\n  _col0\n----------\n zhongguo\n(1 row)\n\nQuery 20160707_073649_00006_iya2r, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n\n```\npresto:default\u003e select md5(col1), sha256(col1) from (values 'aaronshan') as t(col1)\\G;\n-[ RECORD 1 ]-----------------------------------------------------------\n_col0 | 95686bc0483262afe170b550dd4544d1\n_col1 | d16bb375433ad383169f911afdf45e209eabfcf047ba1faebdd8f6a0b39e0a32\n\nQuery 20160712_071936_00006_hkbes, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n#### 3.2 Date functions\n```\npresto\npresto:default\u003e select dayofweek(my_day) from (values '2016-07-07') as t(my_day);\n _col0\n-------\n     4\n(1 row)\n\nQuery 20160707_073523_00005_iya2r, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n#### 3.3 Array functions\n```\npresto:default\u003e select array_union(arr1, arr2) from (values (ARRAY [1,3,5,null], ARRAY [2,3,4,null])) as t(arr1, arr2);\n         _col0\n-----------------------\n [1, 3, 5, null, 2, 4]\n(1 row)\n\nQuery 20160713_061707_00004_82kmt, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n```\npresto:default\u003e select value_count(arr1, 'a') from (values (ARRAY['a', 'b', 'a'])) t(arr1);\n _col0\n-------\n     2\n(1 row)\n\nQuery 20160721_111719_00008_xgf26, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n#### 3.4 JSON functions\n```\npresto:default\u003e select json_array_extract(arr1, '$.book.id') from (values ('[{\"book\":{\"id\":\"12\"}}, {\"book\":{\"id\":\"14\"}}]')) t(arr1);\n    _col0\n--------------\n [\"12\", \"14\"]\n(1 row)\n\nQuery 20160721_105423_00006_xgf26, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n\n```\npresto:default\u003e 
select json_array_extract_scalar(arr1, '$.book.id') from (values ('[{\"book\":{\"id\":\"12\"}}, {\"book\":{\"id\":\"14\"}}]')) t(arr1);\n  _col0\n----------\n [12, 14]\n(1 row)\n\nQuery 20160721_105426_00007_xgf26, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n#### 3.5 Map functions\n```\npresto:default\u003e select map1, value_count(map1, 'a') from (values (map(ARRAY[1,2,3], ARRAY['a', 'b', 'a']))) t(map1);\n      map1       | _col1\n-----------------+-------\n {1=a, 2=b, 3=a} |     2\n(1 row)\n\nQuery 20160721_111906_00011_xgf26, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n#### 3.6 ID card functions\n```\npresto:default\u003e select id_card_info(card) from (values '110101198901084517') as t(card);\n                                      _col0\n----------------------------------------------------------------------------------\n {\"area\":\"东城区\",\"valid\":true,\"province\":\"北京市\",\"gender\":\"男\",\"city\":\"北京市\"}\n(1 row)\n\nQuery 20160712_071700_00004_hkbes, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n#### 3.7 Coordinate functions\n```\npresto:default\u003e select gcj_to_bd(lat,lng), bd_to_gcj(lat,lng), wgs_to_gcj(lat,lng), gcj_to_wgs(lat,lng), gcj_extract_wgs(lat,lng) from (values (39.915, 116.404)) as t(lat, lng)\\G;\n-[ RECORD 1 ]----------------------------------------------\n_col0 | {\"lng\":116.41036949371029,\"lat\":39.92133699351022}\n_col1 | {\"lng\":116.39762729119315,\"lat\":39.90865673957631}\n_col2 | {\"lng\":116.41024449916938,\"lat\":39.91640428150164}\n_col3 | {\"lng\":116.39775550083061,\"lat\":39.91359571849836}\n_col4 | {\"lng\":116.39775549316407,\"lat\":39.913596801757805}\n\nQuery 20160712_024714_00003_9rund, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n\n#### 3.8 Other functions\n```\npresto:default\u003e select is_null(col0),is_null(col1),is_null(col2),is_null(col3) from (values ('test', 
1, 0.5, ARRAY [1]),(null, null, null, null)) as t(col0, col1, col2,col3);\n _col0 | _col1 | _col2 | _col3\n-------+-------+-------+-------\n false | false | false | false\n true  | true  | true  | true\n(2 rows)\n\nQuery 20160713_061435_00003_82kmt, FINISHED, 1 node\nSplits: 1 total, 0 done (0.00%)\n0:00 [0 rows, 0B] [0 rows/s, 0B/s]\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faaronshan%2Fpresto-third-functions","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faaronshan%2Fpresto-third-functions","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faaronshan%2Fpresto-third-functions/lists"}