https://github.com/duyet/clickhouse-udf-rs
Collection of some useful UDFs for ClickHouse written in Rust
- Host: GitHub
- URL: https://github.com/duyet/clickhouse-udf-rs
- Owner: duyet
- Created: 2024-02-21T13:47:59.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2025-04-14T04:44:33.000Z (about 1 year ago)
- Last Synced: 2025-04-14T05:34:27.322Z (about 1 year ago)
- Topics: clickhouse, rust
- Language: Rust
- Homepage:
- Size: 124 KB
- Stars: 8
- Watchers: 3
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
# ClickHouse UDF written in Rust
Collection of some useful UDFs for ClickHouse written in Rust.
Compile into binaries:
```bash
$ cargo build --release
$ ls -lhp target/release | grep -v '/\|\.d'
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 read-wkt-linestring
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-cleaner
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-cleaner-chunk-header
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-manuf
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-manuf-chunk-header
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-year
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-year-chunk-header
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 extract-url
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 has-url
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 array-topk
```
1. [wkt](#1-wkt)
2. [vin](#2-vin)
3. [url](#3-url)
4. [array](#4-array)
# Usage
## 1. `wkt`
Put the wkt binaries into the `user_scripts` folder (`/var/lib/clickhouse/user_scripts/` with default path settings).
```bash
$ cd /var/lib/clickhouse/user_scripts/
$ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_wkt_v0.1.8_x86_64-unknown-linux-musl.tar.gz
$ tar zxvf clickhouse_udf_wkt_v0.1.8_x86_64-unknown-linux-musl.tar.gz
read-wkt-linestring
```
Create the UDF with an XML configuration file.
Define the UDF config file `custom_udf_wkt_function.xml` (`/etc/clickhouse-server/custom_udf_wkt_function.xml` with default path settings;
the file name must match `*_function.xml`).
```xml
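<functions>
    <function>
        <name>readWktLineString</name>
        <type>executable_pool</type>
        <command>read-wkt-linestring</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>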
```
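With `TabSeparated` as the format, ClickHouse writes one argument value per line to the command's standard input and reads one result per line from its standard output. The following is a minimal sketch of that loop, not the code of `read-wkt-linestring` itself; `transform` and `run` are hypothetical names, and the sketch assumes a single `String` argument and does not unescape TabSeparated escape sequences.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical per-row logic standing in for a real UDF body (assumption).
fn transform(value: &str) -> String {
    value.trim().to_uppercase()
}

/// Reads one value per line from `input` and writes one result per line.
fn run<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        writeln!(output, "{}", transform(&line))?;
        // An `executable_pool` process stays alive between queries, so each
        // result is flushed right away instead of waiting for end of input.
        output.flush()?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    run(stdin.lock(), stdout.lock())?;
    Ok(())
}
```

Keeping the row logic in `run` rather than in `main` lets the loop be exercised against in-memory buffers without starting ClickHouse.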
ClickHouse example queries
```sql
SELECT readWktLineString('LINESTRING (30 10, 10 30, 40 40)');
```
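To make the input format concrete, here is a hedged sketch of parsing a 2D WKT `LINESTRING` into coordinate pairs. It is not taken from the repository: `parse_linestring` is a hypothetical helper, it only accepts the upper-case keyword, and the output format of the real `readWktLineString` function may differ.

```rust
/// Parses text such as "LINESTRING (30 10, 10 30, 40 40)" into (x, y) pairs.
/// Returns None when the input is not a well-formed 2D LINESTRING.
/// Hypothetical helper: the real UDF may parse and print differently.
fn parse_linestring(wkt: &str) -> Option<Vec<(f64, f64)>> {
    let rest = wkt.trim().strip_prefix("LINESTRING")?.trim();
    let body = rest.strip_prefix('(')?.strip_suffix(')')?;
    body.split(',')
        .map(|pair| {
            // Each point must be exactly two numbers separated by whitespace.
            let mut nums = pair.split_whitespace().map(|n| n.parse::<f64>().ok());
            match (nums.next()?, nums.next()?, nums.next()) {
                (Some(x), Some(y), None) => Some((x, y)),
                _ => None,
            }
        })
        .collect()
}

fn main() {
    let points = parse_linestring("LINESTRING (30 10, 10 30, 40 40)");
    println!("{:?}", points);
}
```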
## 2. `vin`
Put the vin binaries into the `user_scripts` folder (`/var/lib/clickhouse/user_scripts/` with default path settings).
```bash
$ cd /var/lib/clickhouse/user_scripts/
$ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_vin_v0.1.8_x86_64-unknown-linux-musl.tar.gz
$ tar zxvf clickhouse_udf_vin_v0.1.8_x86_64-unknown-linux-musl.tar.gz
vin-cleaner
vin-cleaner-chunk-header
vin-manuf
vin-manuf-chunk-header
vin-year
vin-year-chunk-header
```
Create the UDFs with an XML configuration file.
Define the UDF config file `custom_udf_vin_function.xml` (`/etc/clickhouse-server/custom_udf_vin_function.xml` with default path settings;
the file name must match `*_function.xml`).
```xml
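<functions>
    <function>
        <name>vinCleaner</name>
        <type>executable_pool</type>
        <command>vin-cleaner</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
    <function>
        <name>vinManuf</name>
        <type>executable_pool</type>
        <command>vin-manuf</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
    <function>
        <name>vinYear</name>
        <type>executable_pool</type>
        <command>vin-year</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>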
```
UDF config with `<send_chunk_header>1</send_chunk_header>`, for the `*-chunk-header` binaries:
```xml
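<functions>
    <function>
        <name>vinCleaner</name>
        <type>executable_pool</type>
        <command>vin-cleaner-chunk-header</command>
        <send_chunk_header>1</send_chunk_header>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
    <function>
        <name>vinManuf</name>
        <type>executable_pool</type>
        <command>vin-manuf-chunk-header</command>
        <send_chunk_header>1</send_chunk_header>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
    <function>
        <name>vinYear</name>
        <type>executable_pool</type>
        <command>vin-year-chunk-header</command>
        <send_chunk_header>1</send_chunk_header>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>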
```
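With `send_chunk_header` enabled, ClickHouse sends the number of rows in a chunk on its own line before the rows themselves, so the program knows when a chunk is complete and can flush once per chunk. The sketch below illustrates that protocol under stated assumptions; it is not the source of the `*-chunk-header` binaries, and `process` and `run_chunked` are hypothetical names.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical per-row logic standing in for a real UDF body (assumption).
fn process(value: &str) -> String {
    value.trim().to_uppercase()
}

/// Reads "<row count>\n" followed by that many rows, repeatedly, until EOF.
fn run_chunked<R: BufRead, W: Write>(mut input: R, mut output: W) -> io::Result<()> {
    let mut line = String::new();
    loop {
        line.clear();
        // Header line: how many rows the next chunk contains.
        if input.read_line(&mut line)? == 0 {
            return Ok(()); // end of input, nothing more to process
        }
        let rows: usize = line
            .trim()
            .parse()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        for _ in 0..rows {
            line.clear();
            if input.read_line(&mut line)? == 0 {
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "chunk ended before all rows were read",
                ));
            }
            writeln!(output, "{}", process(line.trim_end_matches('\n')))?;
        }
        // Flush once per chunk so the server receives the whole batch.
        output.flush()?;
    }
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    run_chunked(stdin.lock(), stdout.lock())?;
    Ok(())
}
```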
ClickHouse example queries
```sql
SELECT vinCleaner('1G1JC1249Y7150000');
SELECT vinCleaner('1G1JC1249Y7150000 ...');
SELECT vinManuf('1G1JC1249Y7150000');
SELECT vinYear('1G1JC1249Y7150000');
```
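For orientation, a 17-character VIN carries the World Manufacturer Identifier in its first three characters and a model-year code in the tenth. The sketch below shows one way to read those two fields; it is an illustration under assumptions, not the repository's implementation. `wmi` and `model_year_1980_cycle` are hypothetical helpers, and because the year code repeats every 30 years the second one only reports the 1980 to 2009 cycle.

```rust
/// Model-year codes in order, starting at 1980 (I, O, Q, U, Z and 0 are unused).
const YEAR_CODES: &str = "ABCDEFGHJKLMNPRSTVWXY123456789";

/// Returns the first three characters (the WMI) of a 17-character ASCII VIN.
/// Hypothetical helper; mapping a WMI to a manufacturer name needs a lookup table.
fn wmi(vin: &str) -> Option<&str> {
    if vin.len() == 17 && vin.is_ascii() {
        Some(&vin[..3])
    } else {
        None
    }
}

/// Decodes the tenth character as a model year in the 1980..=2009 cycle.
/// Assumption: the same code can also mean year + 30, which is ignored here.
fn model_year_1980_cycle(vin: &str) -> Option<u32> {
    if vin.len() != 17 || !vin.is_ascii() {
        return None;
    }
    let code = (vin.as_bytes()[9] as char).to_ascii_uppercase();
    YEAR_CODES.find(code).map(|index| 1980 + index as u32)
}

fn main() {
    let vin = "1G1JC1249Y7150000";
    println!("{:?} {:?}", wmi(vin), model_year_1980_cycle(vin));
}
```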
## 3. `url`
Put the url binaries into the `user_scripts` folder (`/var/lib/clickhouse/user_scripts/` with default path settings).
```bash
$ cd /var/lib/clickhouse/user_scripts/
$ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_url_v0.1.8_x86_64-unknown-linux-musl.tar.gz
$ tar zxvf clickhouse_udf_url_v0.1.8_x86_64-unknown-linux-musl.tar.gz
extract-url
has-url
```
Create the UDFs with an XML configuration file.
Define the UDF config file `custom_udf_url_function.xml` (`/etc/clickhouse-server/custom_udf_url_function.xml` with default path settings;
the file name must match `*_function.xml`).
```xml
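<functions>
    <function>
        <name>extractUrl</name>
        <type>executable_pool</type>
        <command>extract-url</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
    <function>
        <name>hasUrl</name>
        <type>executable_pool</type>
        <command>has-url</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>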
```
ClickHouse example queries
```sql
SELECT extractUrl('extract from this https://duyet.net');
SELECT hasUrl('extract from this https://duyet.net');
SELECT hasUrl('no url here');
```
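As a rough illustration of what URL extraction from free text can look like, the sketch below finds the first `http://` or `https://` occurrence and takes everything up to the next whitespace. This is an assumption about the behavior, not the logic of `extract-url` or `has-url`, which may use a stricter matcher and a different return format; `extract_url` and `has_url` are hypothetical names.

```rust
/// Returns the first whitespace-delimited token that starts with an HTTP(S) scheme.
/// Hypothetical helper: a real matcher would also validate the host and path.
fn extract_url(text: &str) -> Option<&str> {
    // Earliest position at which either scheme appears, if any.
    let start = ["https://", "http://"]
        .iter()
        .filter_map(|scheme| text.find(*scheme))
        .min()?;
    let rest = &text[start..];
    let end = rest.find(char::is_whitespace).unwrap_or(rest.len());
    Some(&rest[..end])
}

/// True when `extract_url` finds something.
fn has_url(text: &str) -> bool {
    extract_url(text).is_some()
}

fn main() {
    let text = "extract from this https://duyet.net";
    println!("{:?} {}", extract_url(text), has_url(text));
}
```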
## 4. `array`
Put the array binaries into the `user_scripts` folder (`/var/lib/clickhouse/user_scripts/` with default path settings).
```bash
$ cd /var/lib/clickhouse/user_scripts/
$ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_array_v0.1.8_x86_64-unknown-linux-musl.tar.gz
$ tar zxvf clickhouse_udf_array_v0.1.8_x86_64-unknown-linux-musl.tar.gz
array-topk
```
Create the UDF with an XML configuration file.
Define the UDF config file `custom_udf_array_function.xml` (`/etc/clickhouse-server/custom_udf_array_function.xml` with default path settings;
the file name must match `*_function.xml`).
```xml
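<functions>
    <function>
        <name>arrayTopK</name>
        <type>executable_pool</type>
        <command>array-topk</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>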
```
ClickHouse example queries
```sql
SELECT arrayTopK(3)([1, 1, 2, 2, 3, 4, 5]);
SELECT arrayTopK(1)([2, 3, 4, 5]);
```
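The README does not spell out the semantics of `arrayTopK`. Assuming it means "the k most frequent elements", a sketch of that computation could look like the following; `top_k` is a hypothetical helper, and the tie-breaking rule (smaller value first) is a choice made here for determinism, not something the source confirms.

```rust
use std::collections::HashMap;

/// Returns up to `k` distinct values, most frequent first.
/// Assumption: ties are broken by the smaller value so the result is stable.
fn top_k(values: &[i64], k: usize) -> Vec<i64> {
    let mut counts: HashMap<i64, usize> = HashMap::new();
    for &value in values {
        *counts.entry(value).or_insert(0) += 1;
    }
    let mut ranked: Vec<(i64, usize)> = counts.into_iter().collect();
    // Sort by count descending, then by value ascending.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked.into_iter().take(k).map(|(value, _)| value).collect()
}

fn main() {
    println!("{:?}", top_k(&[1, 1, 2, 2, 3, 4, 5], 3));
}
```

Counting in a `HashMap` and sorting the distinct values keeps the sketch short; for large arrays with small `k`, a bounded heap would avoid sorting every distinct value.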
# Generate README
```bash
RELEASE_VERSION=0.1.8 cargo run --bin readme-generator . > README.md
```
# License
MIT