{"id":21846907,"url":"https://github.com/duyet/clickhouse-udf-rs","last_synced_at":"2025-10-18T04:30:18.765Z","repository":{"id":223907495,"uuid":"761234853","full_name":"duyet/clickhouse-udf-rs","owner":"duyet","description":"Collection of some useful UDFs for ClickHouse written in Rust","archived":false,"fork":false,"pushed_at":"2025-04-14T04:44:33.000Z","size":127,"stargazers_count":8,"open_issues_count":2,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-14T05:34:27.322Z","etag":null,"topics":["clickhouse","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/duyet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-02-21T13:47:59.000Z","updated_at":"2024-11-17T12:56:07.000Z","dependencies_parsed_at":"2024-02-22T18:05:32.526Z","dependency_job_id":"3da1df46-c8ef-413f-8e41-9bb3c41e1168","html_url":"https://github.com/duyet/clickhouse-udf-rs","commit_stats":null,"previous_names":["duyet/clickhouse-udf-rs"],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/duyet%2Fclickhouse-udf-rs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/duyet%2Fclickhouse-udf-rs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/duyet%2Fclickhouse-udf-rs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/duyet%2Fclickhouse-udf-rs/manifests","owner_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/owners/duyet","download_url":"https://codeload.github.com/duyet/clickhouse-udf-rs/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248888681,"owners_count":21178093,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clickhouse","rust"],"created_at":"2024-11-27T23:16:06.044Z","updated_at":"2025-10-18T04:30:13.716Z","avatar_url":"https://github.com/duyet.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ClickHouse UDF written in Rust \n\nCollection of some useful UDFs for ClickHouse written in Rust.\n\nCompile into binary\n\n```bash\n$ cargo build --release\n\n$ ls -lhp target/release | grep -v '/\\|\\.d'\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 read-wkt-linestring\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 vin-cleaner\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 vin-cleaner-chunk-header\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 vin-manuf\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 vin-manuf-chunk-header\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 vin-year\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 vin-year-chunk-header\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 extract-url\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 has-url\n-rwxr-xr-x    1 duet  staff   434K Feb 24 21:26 array-topk\n\n```\n\n1. [wkt](#1-wkt)\n2. [vin](#2-vin)\n3. [url](#3-url)\n4. [array](#4-array)\n\n\n# Usage\n\n## 1. 
`wkt`\n\n\n\u003cdetails\u003e\n  \u003csummary\u003e\n    Put the \u003cstrong\u003ewkt\u003c/strong\u003e binaries into \u003ccode\u003euser_scripts\u003c/code\u003e folder (\u003ccode\u003e/var/lib/clickhouse/user_scripts/\u003c/code\u003e with default path settings).\n  \u003c/summary\u003e\n\n  ```bash\n  $ cd /var/lib/clickhouse/user_scripts/\n  $ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_wkt_v0.1.8_x86_64-unknown-linux-musl.tar.gz\n  $ tar zxvf clickhouse_udf_wkt_v0.1.8_x86_64-unknown-linux-musl.tar.gz\n\n  read-wkt-linestring\n  \n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\n    Creating UDF using XML configuration \u003ccode\u003ecustom_udf_wkt_function.xml\u003c/code\u003e\n  \u003c/summary\u003e\n\n  Define the UDF config file `custom_udf_wkt_function.xml` (`/etc/clickhouse-server/custom_udf_wkt_function.xml` with default path settings;\n  the file name must match `*_function.xml`).\n\n\n  ```xml\n  \u003cfunctions\u003e\n    \u003c!-- wkt --\u003e\n    \u003cfunction\u003e\n        \u003cname\u003ereadWktLineString\u003c/name\u003e\n        \u003ctype\u003eexecutable_pool\u003c/type\u003e\n        \u003ccommand\u003eread-wkt-linestring\u003c/command\u003e\n        \u003cformat\u003eTabSeparated\u003c/format\u003e\n        \u003cargument\u003e\n            \u003ctype\u003eString\u003c/type\u003e\n            \u003cname\u003evalue\u003c/name\u003e\n        \u003c/argument\u003e\n        \u003creturn_type\u003eString\u003c/return_type\u003e\n    \u003c/function\u003e\n    \n  \u003c/functions\u003e\n  ```\n\u003c/details\u003e\n\n\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eClickHouse example queries\u003c/summary\u003e\n\n  ```sql\n  SELECT readWktLineString('LINESTRING (30 10, 10 30, 40 40)')\n  ```\n\u003c/details\u003e\n\n## 2. 
`vin`\n\n\n\u003cdetails\u003e\n  \u003csummary\u003e\n    Put the \u003cstrong\u003evin\u003c/strong\u003e binaries into \u003ccode\u003euser_scripts\u003c/code\u003e folder (\u003ccode\u003e/var/lib/clickhouse/user_scripts/\u003c/code\u003e with default path settings).\n  \u003c/summary\u003e\n\n  ```bash\n  $ cd /var/lib/clickhouse/user_scripts/\n  $ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_vin_v0.1.8_x86_64-unknown-linux-musl.tar.gz\n  $ tar zxvf clickhouse_udf_vin_v0.1.8_x86_64-unknown-linux-musl.tar.gz\n\n  vin-cleaner\n  vin-cleaner-chunk-header\n  vin-manuf\n  vin-manuf-chunk-header\n  vin-year\n  vin-year-chunk-header\n  \n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\n    Creating UDF using XML configuration \u003ccode\u003ecustom_udf_vin_function.xml\u003c/code\u003e\n  \u003c/summary\u003e\n\n  Define the UDF config file `custom_udf_vin_function.xml` (`/etc/clickhouse-server/custom_udf_vin_function.xml` with default path settings;\n  the file name must match `*_function.xml`).\n\n\n  ```xml\n  \u003cfunctions\u003e\n    \u003c!-- vin --\u003e\n    \u003cfunction\u003e\n        \u003cname\u003evinCleaner\u003c/name\u003e\n        \u003ctype\u003eexecutable_pool\u003c/type\u003e\n        \u003ccommand\u003evin-cleaner\u003c/command\u003e\n        \u003cformat\u003eTabSeparated\u003c/format\u003e\n        \u003cargument\u003e\n            \u003ctype\u003eString\u003c/type\u003e\n            \u003cname\u003evalue\u003c/name\u003e\n        \u003c/argument\u003e\n        \u003creturn_type\u003eString\u003c/return_type\u003e\n    \u003c/function\u003e\n    \u003cfunction\u003e\n        \u003cname\u003evinManuf\u003c/name\u003e\n        \u003ctype\u003eexecutable_pool\u003c/type\u003e\n        \u003ccommand\u003evin-manuf\u003c/command\u003e\n        \u003cformat\u003eTabSeparated\u003c/format\u003e\n        \u003cargument\u003e\n            \u003ctype\u003eString\u003c/type\u003e\n            
\u003cname\u003evalue\u003c/name\u003e\n        \u003c/argument\u003e\n        \u003creturn_type\u003eString\u003c/return_type\u003e\n    \u003c/function\u003e\n    \u003cfunction\u003e\n        \u003cname\u003evinYear\u003c/name\u003e\n        \u003ctype\u003eexecutable_pool\u003c/type\u003e\n        \u003ccommand\u003evin-year\u003c/command\u003e\n        \u003cformat\u003eTabSeparated\u003c/format\u003e\n        \u003cargument\u003e\n            \u003ctype\u003eString\u003c/type\u003e\n            \u003cname\u003evalue\u003c/name\u003e\n        \u003c/argument\u003e\n        \u003creturn_type\u003eString\u003c/return_type\u003e\n    \u003c/function\u003e\n    \n  \u003c/functions\u003e\n  ```\n\u003c/details\u003e\n\n\n\n\n\n\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eUDF config with \u003ccode\u003e\u0026lt;send_chunk_header\u0026gt;1\u0026lt;\u0026#x2F;send_chunk_header\u0026gt;\u003c/code\u003e\u003c/summary\u003e\n\n  ```xml\n  \u003cfunctions\u003e\n      \u003c!-- vin --\u003e\n      \n      \u003cfunction\u003e\n          \u003cname\u003evinCleaner\u003c/name\u003e\n          \u003ctype\u003eexecutable_pool\u003c/type\u003e\n\n          \u003ccommand\u003evin-cleaner-chunk-header\u003c/command\u003e\n          \u003csend_chunk_header\u003e1\u003c/send_chunk_header\u003e\n\n          \u003cformat\u003eTabSeparated\u003c/format\u003e\n          \u003cargument\u003e\n              \u003ctype\u003eString\u003c/type\u003e\n              \u003cname\u003evalue\u003c/name\u003e\n          \u003c/argument\u003e\n          \u003creturn_type\u003eString\u003c/return_type\u003e\n      \u003c/function\u003e\n      \n      \u003cfunction\u003e\n          \u003cname\u003evinManuf\u003c/name\u003e\n          \u003ctype\u003eexecutable_pool\u003c/type\u003e\n\n          \u003ccommand\u003evin-manuf-chunk-header\u003c/command\u003e\n          \u003csend_chunk_header\u003e1\u003c/send_chunk_header\u003e\n\n          \u003cformat\u003eTabSeparated\u003c/format\u003e\n    
      \u003cargument\u003e\n              \u003ctype\u003eString\u003c/type\u003e\n              \u003cname\u003evalue\u003c/name\u003e\n          \u003c/argument\u003e\n          \u003creturn_type\u003eString\u003c/return_type\u003e\n      \u003c/function\u003e\n      \n      \u003cfunction\u003e\n          \u003cname\u003evinYear\u003c/name\u003e\n          \u003ctype\u003eexecutable_pool\u003c/type\u003e\n\n          \u003ccommand\u003evin-year-chunk-header\u003c/command\u003e\n          \u003csend_chunk_header\u003e1\u003c/send_chunk_header\u003e\n\n          \u003cformat\u003eTabSeparated\u003c/format\u003e\n          \u003cargument\u003e\n              \u003ctype\u003eString\u003c/type\u003e\n              \u003cname\u003evalue\u003c/name\u003e\n          \u003c/argument\u003e\n          \u003creturn_type\u003eString\u003c/return_type\u003e\n      \u003c/function\u003e\n      \u003c/functions\u003e\n  ```\n\n\u003c/details\u003e\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eClickHouse example queries\u003c/summary\u003e\n\n  ```sql\n  SELECT vinCleaner('1G1JC1249Y7150000')\n  SELECT vinCleaner('1G1JC1249Y7150000 ...')\n  \n  SELECT vinManuf('1G1JC1249Y7150000')\n  \n  SELECT vinYear('1G1JC1249Y7150000')\n  ```\n\u003c/details\u003e\n\n## 3. 
`url`\n\n\n\u003cdetails\u003e\n  \u003csummary\u003e\n    Put the \u003cstrong\u003eurl\u003c/strong\u003e binaries into \u003ccode\u003euser_scripts\u003c/code\u003e folder (\u003ccode\u003e/var/lib/clickhouse/user_scripts/\u003c/code\u003e with default path settings).\n  \u003c/summary\u003e\n\n  ```bash\n  $ cd /var/lib/clickhouse/user_scripts/\n  $ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_url_v0.1.8_x86_64-unknown-linux-musl.tar.gz\n  $ tar zxvf clickhouse_udf_url_v0.1.8_x86_64-unknown-linux-musl.tar.gz\n\n  extract-url\n  has-url\n  \n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\n    Creating UDF using XML configuration \u003ccode\u003ecustom_udf_url_function.xml\u003c/code\u003e\n  \u003c/summary\u003e\n\n  Define the UDF config file `custom_udf_url_function.xml` (`/etc/clickhouse-server/custom_udf_url_function.xml` with default path settings;\n  the file name must match `*_function.xml`).\n\n\n  ```xml\n  \u003cfunctions\u003e\n    \u003c!-- url --\u003e\n    \u003cfunction\u003e\n        \u003cname\u003eextractUrl\u003c/name\u003e\n        \u003ctype\u003eexecutable_pool\u003c/type\u003e\n        \u003ccommand\u003eextract-url\u003c/command\u003e\n        \u003cformat\u003eTabSeparated\u003c/format\u003e\n        \u003cargument\u003e\n            \u003ctype\u003eString\u003c/type\u003e\n            \u003cname\u003evalue\u003c/name\u003e\n        \u003c/argument\u003e\n        \u003creturn_type\u003eString\u003c/return_type\u003e\n    \u003c/function\u003e\n    \u003cfunction\u003e\n        \u003cname\u003ehasUrl\u003c/name\u003e\n        \u003ctype\u003eexecutable_pool\u003c/type\u003e\n        \u003ccommand\u003ehas-url\u003c/command\u003e\n        \u003cformat\u003eTabSeparated\u003c/format\u003e\n        \u003cargument\u003e\n            \u003ctype\u003eString\u003c/type\u003e\n            \u003cname\u003evalue\u003c/name\u003e\n        \u003c/argument\u003e\n        
\u003creturn_type\u003eString\u003c/return_type\u003e\n    \u003c/function\u003e\n    \n  \u003c/functions\u003e\n  ```\n\u003c/details\u003e\n\n\n\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eClickHouse example queries\u003c/summary\u003e\n\n  ```sql\n  SELECT extractUrl('extract from this https://duyet.net')\n  \n  SELECT hasUrl('extract from this https://duyet.net')\n  SELECT hasUrl('no url here')\n  ```\n\u003c/details\u003e\n\n## 4. `array`\n\n\n\u003cdetails\u003e\n  \u003csummary\u003e\n    Put the \u003cstrong\u003earray\u003c/strong\u003e binaries into \u003ccode\u003euser_scripts\u003c/code\u003e folder (\u003ccode\u003e/var/lib/clickhouse/user_scripts/\u003c/code\u003e with default path settings).\n  \u003c/summary\u003e\n\n  ```bash\n  $ cd /var/lib/clickhouse/user_scripts/\n  $ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_array_v0.1.8_x86_64-unknown-linux-musl.tar.gz\n  $ tar zxvf clickhouse_udf_array_v0.1.8_x86_64-unknown-linux-musl.tar.gz\n\n  array-topk\n  \n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\n    Creating UDF using XML configuration \u003ccode\u003ecustom_udf_array_function.xml\u003c/code\u003e\n  \u003c/summary\u003e\n\n  Define the UDF config file `custom_udf_array_function.xml` (`/etc/clickhouse-server/custom_udf_array_function.xml` with default path settings;\n  the file name must match `*_function.xml`).\n\n\n  ```xml\n  \u003cfunctions\u003e\n    \u003c!-- array --\u003e\n    \u003cfunction\u003e\n        \u003cname\u003earrayTopK\u003c/name\u003e\n        \u003ctype\u003eexecutable_pool\u003c/type\u003e\n        \u003ccommand\u003earray-topk\u003c/command\u003e\n        \u003cformat\u003eTabSeparated\u003c/format\u003e\n        \u003cargument\u003e\n            \u003ctype\u003eString\u003c/type\u003e\n            \u003cname\u003evalue\u003c/name\u003e\n        \u003c/argument\u003e\n        \u003creturn_type\u003eString\u003c/return_type\u003e\n    
\u003c/function\u003e\n    \n  \u003c/functions\u003e\n  ```\n\u003c/details\u003e\n\n\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eClickHouse example queries\u003c/summary\u003e\n\n  ```sql\n  SELECT arrayTopK(3)([1, 1, 2, 2, 3, 4, 5])\n  SELECT arrayTopK(1)([2, 3, 4, 5])\n  ```\n\u003c/details\u003e\n\n\n\n# Generate README\n\n```bash\nRELEASE_VERSION=0.1.8 cargo run --bin readme-generator . \u003e README.md\n```\n\n# License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fduyet%2Fclickhouse-udf-rs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fduyet%2Fclickhouse-udf-rs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fduyet%2Fclickhouse-udf-rs/lists"}