{"id":13849636,"url":"https://github.com/qubole/presto-udfs","last_synced_at":"2025-07-11T05:30:36.949Z","repository":{"id":28175764,"uuid":"31677182","full_name":"qubole/presto-udfs","owner":"qubole","description":"Plugin for Presto to allow addition of user functions easily","archived":false,"fork":false,"pushed_at":"2021-03-31T20:07:29.000Z","size":117,"stargazers_count":115,"open_issues_count":1,"forks_count":65,"subscribers_count":21,"default_branch":"master","last_synced_at":"2024-08-05T20:28:21.238Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qubole.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-03-04T20:02:33.000Z","updated_at":"2024-05-23T13:12:26.000Z","dependencies_parsed_at":"2022-09-04T16:50:44.110Z","dependency_job_id":null,"html_url":"https://github.com/qubole/presto-udfs","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qubole%2Fpresto-udfs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qubole%2Fpresto-udfs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qubole%2Fpresto-udfs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qubole%2Fpresto-udfs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qubole","download_url":"https://codeload.github.com/qubole/presto-udfs/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":22
5693821,"owners_count":17509227,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-04T20:00:31.227Z","updated_at":"2024-11-21T08:06:56.124Z","avatar_url":"https://github.com/qubole.png","language":"Java","funding_links":[],"categories":["Libraries"],"sub_categories":[],"readme":"\u003c!--\n{% comment %}\n  Copyright (c) 2016. Qubole Inc\n  Licensed under the Apache License, Version 2.0 (the \"License\");\n  you may not use this file except in compliance with the License.\n  You may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\n  Unless required by applicable law or agreed to in writing, software\n  distributed under the License is distributed on an \"AS IS\" BASIS,\n  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n  See the License for the specific language governing permissions and\n  limitations under the License. See accompanying LICENSE file.\n{% endcomment %}\n--\u003e\n# Presto User-Defined Functions(UDFs)\nPlugin for Presto to allow addition of user defined functions. 
The plugin simplifies the process of adding user functions to Presto.\n\n## Plugging in Presto UDFs\nDetails on how to plug in Presto UDFs can be found [here](https://www.qubole.com/blog/product/plugging-in-presto-udfs/?nabe=5695374637924352:1).\n\n## Presto Version Compatibility\n\n| Presto Version| Last Compatible Release|\n| ---------------- |:----------:|\n| _ver 300+_       | current    |\n| _ver 0.193-0.2xx_| udfs-2.0.3 |\n| _ver 0.180_      | udfs-2.0.2 |\n| _ver 0.157_      | udfs-2.0.1 |\n| _ver 0.142_      | udfs-1.0.0 |\n| _ver 0.119_      | udfs-0.1.3 |\n\n## Implemented User Defined Functions\nThe repository contains the following UDFs implemented for Presto:\n\n#### HIVE UDFs\n* **DATE-TIME Functions**\n 1. **to_utc_timestamp(timestamp, string timezone) -\u003e timestamp** \u003cbr /\u003e\n      Assumes the given timestamp is in the given timezone and converts it to UTC (as of Hive 0.8.0). For example, to_utc_timestamp('1970-01-01 00:00:00','PST') returns 1970-01-01 08:00:00.\n 2. **from_utc_timestamp(timestamp, string timezone) -\u003e timestamp**\u003cbr /\u003e\n      Assumes the given timestamp is in UTC and converts it to the given timezone (as of Hive 0.8.0). For example, from_utc_timestamp('1970-01-01 08:00:00','PST') returns 1970-01-01 00:00:00.\n 3. **unix_timestamp() -\u003e timestamp**\u003cbr /\u003e\n      Gets the current Unix timestamp in seconds.\n 4. **year(string date) -\u003e int**\u003cbr /\u003e\n      Returns the year part of a date or a timestamp string: year(\"1970-01-01 00:00:00\") = 1970, year(\"1970-01-01\") = 1970.\n 5. **month(string date) -\u003e int**\u003cbr /\u003e\n      Returns the month part of a date or a timestamp string: month(\"1970-11-01 00:00:00\") = 11, month(\"1970-11-01\") = 11.\n 6. **day(string date) -\u003e int**\u003cbr /\u003e\n      Returns the day part of a date or a timestamp string: day(\"1970-11-01 00:00:00\") = 1, day(\"1970-11-01\") = 1.\n 7. 
**hour(string date) -\u003e int**\u003cbr /\u003e\n      Returns the hour of the timestamp: hour('2009-07-30 12:58:59') = 12, hour('12:58:59') = 12.\n 8. **minute(string date) -\u003e int**\u003cbr /\u003e\n      Returns the minute of the timestamp: minute('2009-07-30 12:58:59') = 58, minute('12:58:59') = 58.\n 9. **second(string date) -\u003e int**\u003cbr /\u003e\n      Returns the second of the timestamp: second('2009-07-30 12:58:59') = 59, second('12:58:59') = 59.\n 10. **to_date(string timestamp) -\u003e string**\u003cbr /\u003e\n      Returns the date part of a timestamp string: to_date(\"1970-01-01 00:00:00\") = \"1970-01-01\".\n 11. **weekofyear(string date) -\u003e int**\u003cbr /\u003e\n      Returns the week number of a timestamp string: weekofyear(\"1970-11-01 00:00:00\") = 44, weekofyear(\"1970-11-01\") = 44.\n 12. **date_sub(string startdate, int days) -\u003e string**\u003cbr /\u003e\n      Subtracts a number of days from startdate: date_sub('2008-12-31', 1) = '2008-12-30'.\n 13. **date_add(string startdate, int days) -\u003e string**\u003cbr /\u003e\n      Adds a number of days to startdate: date_add('2008-12-31', 1) = '2009-01-01'.\n 14. **datediff(string enddate, string startdate) -\u003e string**\u003cbr /\u003e\n      Returns the number of days from startdate to enddate: datediff('2009-03-01', '2009-02-27') = 2.\n 15. **format_unixtimestamp(bigint unixtime[, string format]) -\u003e string**\u003cbr /\u003e\n      Converts the number of seconds from the Unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone, in the format \"1970-01-01 00:00:00\" unless a format string is specified. If a format string is specified, the epoch time is converted to the specified format. 
More information about the formatter can be found [here](https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html).\u003cbr /\u003e\n      _**NOTE :** Presto 0.142's implementation of `from_unixtime(bigint unixtime)`, which returns the value as a timestamp type, collides by name with Hive's `from_unixtime(bigint unixtime[, string format])`, which returns the value as a string type and supports a formatter, so the Hive UDF has been implemented as `format_unixtimestamp(bigint unixtime[, string format])`._\n 16. **from_duration(string duration, string duration_unit) -\u003e double**\u003cbr /\u003e\n      Converts a string representing a time duration in airlift's Duration format (https://github.com/airlift/units/blob/master/src/main/java/io/airlift/units/Duration.java) to a double representing the time in the specified unit: from_duration('4h', 'ms') = 1.44E7.\n 17. **from_datasize(string datasize, string size_unit) -\u003e double**\u003cbr /\u003e\n       Converts a string representing a data size in airlift's DataSize format (https://github.com/airlift/units/blob/master/src/main/java/io/airlift/units/DataSize.java) to a double representing the size in the specified unit: from_datasize('1GB', 'B') = 1.073741824E9.\n\n\n* **MATH Functions**\n 1. **pmod(INT a, INT b) -\u003e INT, pmod(DOUBLE a, DOUBLE b) -\u003e DOUBLE**\u003cbr /\u003e\n      Returns the positive value of a mod b: pmod(17, -5) = -3.\n 2. **rands(INT seed) -\u003e DOUBLE**\u003cbr /\u003e\n      Returns a random number (that changes from row to row) that is distributed uniformly from 0 to 1. 
Specifying the seed makes the generated random number sequence deterministic: rands(3) = 0.731057369148862.\u003cbr /\u003e\n      _**NOTE :** Presto 0.142's implementation of `rand(int a)`, which returns a number between 0 and a, collides by name with Hive's `rand(int seed)`, which sets the seed for the random number generator, so the Hive UDF has been implemented as `rands(int seed)`._\n 3. **bin(BIGINT a) -\u003e STRING**\u003cbr /\u003e\n      Returns the number in binary format: bin(100) = 1100100.\n 4. **hex(BIGINT a) -\u003e STRING, hex(STRING a) -\u003e STRING, hex(BINARY a) -\u003e STRING**\u003cbr /\u003e\n      If the argument is an INT or binary, hex returns the number as a STRING in hexadecimal format. Otherwise, if the argument is a STRING, it converts each character into its hexadecimal representation and returns the resulting STRING: hex(123) = 7b, hex('123') = 7b, hex('1100100') = 64.\n 5. **unhex(STRING a) -\u003e BINARY**\u003cbr /\u003e\n      Inverse of hex. Interprets each pair of characters as a hexadecimal number and converts it to the byte representation of the number: unhex('7b') = 1111011.\n\n* **STRING Functions**\n 1. **locate(string substr, string str[, int pos]) -\u003e int** \u003cbr /\u003e\n      Returns the position of the first occurrence of substr in str after position pos: locate('si', 'mississipi', 2) = 4, locate('si', 'mississipi', 5) = 7.\n 2. **find_in_set(string str, string strList) -\u003e int** \u003cbr /\u003e\n      Returns the position of the first occurrence of str in strList, where strList is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument contains any commas: find_in_set('ab', 'abc,b,ab,c,def') returns 3.\n 3. **instr(string str, string substr) -\u003e int** \u003cbr /\u003e\n      Returns the position of the first occurrence of substr in str. 
Returns null if either argument is null and returns 0 if substr is not found in str: instr('mississipi', 'si') = 4.\n\n* **CONDITIONAL Functions**\n  1. **nvl(T value, T default_value) -\u003e T**\u003cbr/\u003e\n      _**NOTE :** Supported only up to v1.0.0 because of the limitations that newer versions of Presto place on plugins._\u003cbr/\u003e\n      Returns the default value if value is null, otherwise returns value: nvl(3,4) = 3, nvl(NULL,4) = 4.\n\n* **MISCELLANEOUS Functions**\n  1. **hash(a1[, a2...]) -\u003e int**\u003cbr/\u003e\n      _**NOTE :** Supported only up to v1.0.0 because of the limitations that newer versions of Presto place on plugins._\u003cbr/\u003e\n      Returns a hash value of the arguments: hash('a','b','c') = 143025634.\n\n## Adding User Defined Functions to Presto-UDFs\n Functions can be added using annotations; see https://prestosql.io/docs/current/develop/functions.html for details on how to add functions.\n\n  _**NOTE :** Code-generated functions were supported only up to v1.0.0 because of the limitations that newer versions of Presto place on plugins._\n\n## Release a new version of presto-udfs\nReleases are always created from `master`. During development, `master`\nhas a version like `X.Y.Z-SNAPSHOT`.\n\n    # Change version as per http://semver.org/\n    mvn release:prepare -Prelease\n    mvn release:perform -Prelease\n    git push\n    git push --tags\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqubole%2Fpresto-udfs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqubole%2Fpresto-udfs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqubole%2Fpresto-udfs/lists"}