https://github.com/duyet/clickhouse-udf-rs


# ClickHouse UDF written in Rust

Collection of some useful UDFs for ClickHouse written in Rust.

Compile into binaries:

```bash
$ cargo build --release

$ ls -lhp target/release | grep -v '/\|\.d'
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 read-wkt-linestring
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-cleaner
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-cleaner-chunk-header
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-manuf
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-manuf-chunk-header
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-year
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 vin-year-chunk-header
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 extract-url
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 has-url
-rwxr-xr-x 1 duet staff 434K Feb 24 21:26 array-topk

```

1. [wkt](#1-wkt)
2. [vin](#2-vin)
3. [url](#3-url)
4. [array](#4-array)

# Usage

## 1. `wkt`


Put the wkt binaries into the `user_scripts` folder (`/var/lib/clickhouse/user_scripts/` with default path settings).

```bash
$ cd /var/lib/clickhouse/user_scripts/
$ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_wkt_v0.1.8_x86_64-unknown-linux-musl.tar.gz
$ tar zxvf clickhouse_udf_wkt_v0.1.8_x86_64-unknown-linux-musl.tar.gz

read-wkt-linestring

```


Create the UDF using the XML configuration file `custom_udf_wkt_function.xml`.

Define the UDF config file at `/etc/clickhouse-server/custom_udf_wkt_function.xml` (the default path settings). The file name must match `*_function.xml`.

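```xml
<functions>
    <function>
        <name>readWktLineString</name>
        <type>executable_pool</type>
        <command>read-wkt-linestring</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>
```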
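
With `executable_pool` and the `TabSeparated` format, ClickHouse keeps the process alive, writes one argument value per line to its stdin, and reads one result per line from its stdout. The loop below is a minimal sketch of that contract, not the code behind `read-wkt-linestring`; the name `process_rows` and the closure-based transform are hypothetical, and TabSeparated escaping of tabs and newlines is ignored.

```rust
use std::io::{self, BufRead, Write};

/// Reads one row per line from `input`, applies `transform`, and writes one
/// result per line to `output`. `process_rows` is a hypothetical helper name.
fn process_rows<R: BufRead, W: Write>(
    input: R,
    mut output: W,
    transform: impl Fn(&str) -> String,
) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        writeln!(output, "{}", transform(&line))?;
        // Flush after every row: the pooled process stays alive between
        // queries, so output left in a buffer would stall the caller.
        output.flush()?;
    }
    Ok(())
}
```

In a binary, `main` would call something like `process_rows(io::stdin().lock(), io::stdout().lock(), |s| s.to_uppercase())`, with the closure replaced by the real per-row logic.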

ClickHouse example queries:

```sql
-- ClickHouse string literals use single quotes; double quotes denote identifiers.
SELECT readWktLineString('LINESTRING (30 10, 10 30, 40 40)');
```
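
To illustrate the kind of input this function receives, here is a small sketch that parses a WKT `LINESTRING` into coordinate pairs. It is an assumption-laden example: `parse_linestring` is a hypothetical helper, it only accepts the upper-case keyword and 2D points, and it says nothing about the output format the real UDF returns.

```rust
/// Parses a WKT LINESTRING such as "LINESTRING (30 10, 10 30, 40 40)" into
/// (x, y) pairs. Returns None on malformed input. Hypothetical helper:
/// only the upper-case keyword and the first two numbers per point are read.
fn parse_linestring(wkt: &str) -> Option<Vec<(f64, f64)>> {
    let rest = wkt.trim().strip_prefix("LINESTRING")?.trim();
    let inner = rest.strip_prefix('(')?.strip_suffix(')')?;
    inner
        .split(',')
        .map(|pair| {
            // Each point is "x y", separated by whitespace.
            let mut nums = pair.split_whitespace().map(|n| n.parse::<f64>().ok());
            let x = nums.next()??;
            let y = nums.next()??;
            Some((x, y))
        })
        .collect()
}
```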

## 2. `vin`


Put the vin binaries into the `user_scripts` folder (`/var/lib/clickhouse/user_scripts/` with default path settings).

```bash
$ cd /var/lib/clickhouse/user_scripts/
$ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_vin_v0.1.8_x86_64-unknown-linux-musl.tar.gz
$ tar zxvf clickhouse_udf_vin_v0.1.8_x86_64-unknown-linux-musl.tar.gz

vin-cleaner
vin-cleaner-chunk-header
vin-manuf
vin-manuf-chunk-header
vin-year
vin-year-chunk-header

```


Create the UDF using the XML configuration file `custom_udf_vin_function.xml`.

Define the UDF config file at `/etc/clickhouse-server/custom_udf_vin_function.xml` (the default path settings). The file name must match `*_function.xml`.

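```xml
<functions>
    <function>
        <name>vinCleaner</name>
        <type>executable_pool</type>
        <command>vin-cleaner</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
    <function>
        <name>vinManuf</name>
        <type>executable_pool</type>
        <command>vin-manuf</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
    <function>
        <name>vinYear</name>
        <type>executable_pool</type>
        <command>vin-year</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>
```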

UDF config with `<send_chunk_header>1</send_chunk_header>`:

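```xml
<functions>
    <function>
        <name>vinCleaner</name>
        <type>executable_pool</type>

        <command>vin-cleaner-chunk-header</command>
        <send_chunk_header>1</send_chunk_header>

        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>

    <function>
        <name>vinManuf</name>
        <type>executable_pool</type>

        <command>vin-manuf-chunk-header</command>
        <send_chunk_header>1</send_chunk_header>

        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>

    <function>
        <name>vinYear</name>
        <type>executable_pool</type>

        <command>vin-year-chunk-header</command>
        <send_chunk_header>1</send_chunk_header>

        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>
```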
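
When `send_chunk_header` is enabled, ClickHouse sends the number of rows in a chunk on its own line before the rows themselves, so the binary knows where each chunk ends and can flush once per chunk. The following is a rough sketch of such a reader, not the code of the `*-chunk-header` binaries; `process_chunks` and the closure-based transform are hypothetical names.

```rust
use std::io::{self, BufRead, Write};

/// Handles chunks framed by a row-count header: the first line of each chunk
/// is the number of rows that follow. `process_chunks` is a hypothetical name.
fn process_chunks<R: BufRead, W: Write>(
    mut input: R,
    mut output: W,
    transform: impl Fn(&str) -> String,
) -> io::Result<()> {
    let mut line = String::new();
    loop {
        line.clear();
        // Zero bytes read means the pipe was closed: stop cleanly.
        if input.read_line(&mut line)? == 0 {
            return Ok(());
        }
        let rows: usize = line
            .trim()
            .parse()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        for _ in 0..rows {
            line.clear();
            if input.read_line(&mut line)? == 0 {
                return Err(io::ErrorKind::UnexpectedEof.into());
            }
            writeln!(output, "{}", transform(line.trim_end_matches('\n')))?;
        }
        // One flush per chunk instead of one per row.
        output.flush()?;
    }
}
```

Compared with a plain line-by-line loop, the header lets the process batch its output, which avoids a flush for every row.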

ClickHouse example queries:

```sql
-- ClickHouse string literals use single quotes; double quotes denote identifiers.
SELECT vinCleaner('1G1JC1249Y7150000');
SELECT vinCleaner('1G1JC1249Y7150000 ...');

SELECT vinManuf('1G1JC1249Y7150000');

SELECT vinYear('1G1JC1249Y7150000');
```
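
For background on what a year lookup involves: in the standard 17-character VIN layout, the 10th character encodes the model year on a 30-year cycle. The sketch below decodes that character for the 1980-2009 cycle only. It is an independent illustration; `vin_model_year` is a hypothetical helper, and the behavior and output format of the `vin-year` binary may differ.

```rust
/// Decodes the model-year character (position 10) of a 17-character VIN,
/// assuming the 1980-2009 cycle. The same codes repeat every 30 years, so
/// 'Y' could also mean 2030. Hypothetical helper, not the crate's code.
fn vin_model_year(vin: &str) -> Option<u32> {
    // Year codes in order; I, O, Q, U, Z and 0 are never used.
    const CODES: &str = "ABCDEFGHJKLMNPRSTVWXY123456789";
    if vin.len() != 17 || !vin.is_ascii() {
        return None;
    }
    let code = vin.as_bytes()[9].to_ascii_uppercase() as char;
    // The index into CODES is the offset from 1980.
    CODES.find(code).map(|i| 1980 + i as u32)
}
```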

## 3. `url`


Put the url binaries into the `user_scripts` folder (`/var/lib/clickhouse/user_scripts/` with default path settings).

```bash
$ cd /var/lib/clickhouse/user_scripts/
$ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_url_v0.1.8_x86_64-unknown-linux-musl.tar.gz
$ tar zxvf clickhouse_udf_url_v0.1.8_x86_64-unknown-linux-musl.tar.gz

extract-url
has-url

```


Create the UDF using the XML configuration file `custom_udf_url_function.xml`.

Define the UDF config file at `/etc/clickhouse-server/custom_udf_url_function.xml` (the default path settings). The file name must match `*_function.xml`.

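```xml
<functions>
    <function>
        <name>extractUrl</name>
        <type>executable_pool</type>
        <command>extract-url</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
    <function>
        <name>hasUrl</name>
        <type>executable_pool</type>
        <command>has-url</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>
```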

ClickHouse example queries:

```sql
-- ClickHouse string literals use single quotes; double quotes denote identifiers.
SELECT extractUrl('extract from this https://duyet.net');

SELECT hasUrl('extract from this https://duyet.net');
SELECT hasUrl('no url here');
```
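
As a rough idea of the per-row logic behind functions like these, the sketch below finds the first whitespace-delimited token that looks like an HTTP(S) URL. The names `first_url` and `contains_url` are hypothetical, and the matching rule is a deliberate simplification; the actual `extract-url` and `has-url` binaries may detect URLs differently and format their results differently.

```rust
/// Returns the first whitespace-delimited token that starts with "http://"
/// or "https://". Hypothetical helper; real URL detection is more involved.
fn first_url(text: &str) -> Option<&str> {
    text.split_whitespace()
        .find(|tok| tok.starts_with("http://") || tok.starts_with("https://"))
}

/// True when `text` contains at least one such token.
fn contains_url(text: &str) -> bool {
    first_url(text).is_some()
}
```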

## 4. `array`


Put the array binaries into the `user_scripts` folder (`/var/lib/clickhouse/user_scripts/` with default path settings).

```bash
$ cd /var/lib/clickhouse/user_scripts/
$ wget https://github.com/duyet/clickhouse-udf-rs/releases/download/0.1.8/clickhouse_udf_array_v0.1.8_x86_64-unknown-linux-musl.tar.gz
$ tar zxvf clickhouse_udf_array_v0.1.8_x86_64-unknown-linux-musl.tar.gz

array-topk

```


Create the UDF using the XML configuration file `custom_udf_array_function.xml`.

Define the UDF config file at `/etc/clickhouse-server/custom_udf_array_function.xml` (the default path settings). The file name must match `*_function.xml`.

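```xml
<functions>
    <function>
        <name>arrayTopK</name>
        <type>executable_pool</type>
        <command>array-topk</command>
        <format>TabSeparated</format>
        <argument>
            <type>String</type>
            <name>value</name>
        </argument>
        <return_type>String</return_type>
    </function>
</functions>
```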

ClickHouse example queries:

```sql
SELECT arrayTopK(3)([1, 1, 2, 2, 3, 4, 5]);
SELECT arrayTopK(1)([2, 3, 4, 5]);
```

# Generate README

```bash
RELEASE_VERSION=0.1.8 cargo run --bin readme-generator . > README.md
```

# License

MIT
