An open API service indexing awesome lists of open source software.

https://github.com/quackscience/duckdb-extension-clickhouse-native

Experimental ClickHouse Native Client and Native file reader Extension for DuckDB chsql
https://github.com/quackscience/duckdb-extension-clickhouse-native

clickhouse clickhouse-client clickhouse-native duckdb duckdb-extension reader

Last synced: about 1 year ago
JSON representation

Experimental ClickHouse Native Client and Native file reader Extension for DuckDB chsql

Awesome Lists containing this project

README

          



# DuckDB Clickhouse Native Extension for [chsql](https://github.com/quackscience/duckdb-extension-clickhouse-sql)
Experimental ClickHouse Native Client and Native file reader for DuckDB chsql

### ๐Ÿ“ฆ Installation
```sql
INSTALL chsql_native FROM community;
LOAD chsql_native;
```

## ๐Ÿค– Native Client
The extension provides an experimental clickhouse native client: `clickhouse_scan`
### ๐Ÿ Settings
```bash
# Local Setup, Insecure
export CLICKHOUSE_URL="tcp://localhost:9000"
# Remote Setup, Secure
export CLICKHOUSE_URL="tcp://user:pass@remote:9440/?secure=true&skip_verify=true"
```
### โœ๏ธ Usage
```sql
D SELECT * FROM clickhouse_scan("SELECT version(), 'hello', 123");
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ version() โ”‚ 'hello' โ”‚ 123 โ”‚
โ”‚ varchar โ”‚ varchar โ”‚ uint32 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 24.10.2.80 โ”‚ hello โ”‚ 123 โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
```

## ๐Ÿค– Native Reader
The extension provides an experimental clickhouse native file reader: `clickhouse_native`

### ๐Ÿ Input
Generate some native files with `clickhouse-local` or `clickhouse-server`

```sql
--- simple w/ one row, two columns
SELECT version(), number FROM numbers(1) INTO OUTFILE '/tmp/numbers.clickhouse' FORMAT Native;
--- simple w/ one column, 100000 rows
SELECT number FROM numbers(100000) INTO OUTFILE '/tmp/100000.clickhouse' FORMAT Native;
--- complex w/ multiple types
SELECT * FROM system.functions LIMIT 10 INTO OUTFILE '/tmp/functions.clickhouse' FORMAT Native;
```

### โœ๏ธ Usage
Read ClickHouse Native files with DuckDB. Reads are full-scans at this time.

```sql
D SELECT * FROM clickhouse_native('/tmp/numbers.clickhouse');
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ version() โ”‚ number โ”‚
โ”‚ varchar โ”‚ int32 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 24.12.1.1273 โ”‚ 0 โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
```
```sql
D SELECT count(*), max(number) FROM clickhouse_native('/tmp/100000.clickhouse');
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ count_star() โ”‚ max(number) โ”‚
โ”‚ int64 โ”‚ int32 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 100000 โ”‚ 99999 โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
```
```sql
D SELECT * FROM clickhouse_native('/tmp/functions.clickhouse') WHERE alias_to != '' LIMIT 10;
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ name โ”‚ is_aggregate โ”‚ case_insensitive โ”‚ alias_to โ”‚ create_query โ”‚ origin โ”‚ โ€ฆ โ”‚ syntax โ”‚ arguments โ”‚ returned_value โ”‚ examples โ”‚ categories โ”‚
โ”‚ varchar โ”‚ int32 โ”‚ int32 โ”‚ varchar โ”‚ varchar โ”‚ varchar โ”‚ โ”‚ varchar โ”‚ varchar โ”‚ varchar โ”‚ varchar โ”‚ varchar โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ connection_id โ”‚ 0 โ”‚ 1 โ”‚ connectionID โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ rand32 โ”‚ 0 โ”‚ 0 โ”‚ rand โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ INET6_ATON โ”‚ 0 โ”‚ 1 โ”‚ IPv6StringToNum โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ INET_ATON โ”‚ 0 โ”‚ 1 โ”‚ IPv4StringToNum โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ truncate โ”‚ 0 โ”‚ 1 โ”‚ trunc โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ ceiling โ”‚ 0 โ”‚ 1 โ”‚ ceil โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ replace โ”‚ 0 โ”‚ 1 โ”‚ replaceAll โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ from_utc_timestamp โ”‚ 0 โ”‚ 1 โ”‚ fromUTCTimestamp โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ mapFromString โ”‚ 0 โ”‚ 0 โ”‚ extractKeyValuePairs โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”‚ str_to_map โ”‚ 0 โ”‚ 1 โ”‚ extractKeyValuePairs โ”‚ โ”‚ System โ”‚ โ€ฆ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 10 rows 12 columns (11 shown) โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
```

#### Notes

> The reader is a clear room implementation of the ClickHouse Native file format using no code or libraries from ClickHouse Inc. As such it is potentially incomplete, imperfect and might not be compatible with all files. USE AT YOUR OWN RISK!

### ๐ŸŽ Performance
Simple CLI _cold start_ count() test using `duckdb` vs. `clickhouse-local` and 1M rows
#### DuckDB
```sql
# time duckdb -c "LOAD chsql_native; SELECT count(*) FROM clickhouse_native('/tmp/1M.clickhouse');"
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ count_star() โ”‚
โ”‚ int64 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 1000000 โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

real 0m0.095s
user 0m0.077s
sys 0m0.029s
```
#### clickhouse-local
```sql
# time clickhouse local "SELECT count(*) FROM '/tmp/1M.clickhouse'";
1000000

real 0m0.141s
user 0m0.086s
sys 0m0.043s
```


### โ›‘๏ธ Extension Status
- [x] Basic Fomat Reading
- [x] Column Extraction
- [x] Blocks Parser & Iterator
- [x] Type Mapping WIP
- [x] Strings
- [x] Integers
- [x] Enums
- [ ] ??? as String
- [ ] Compression support
- [x] Basic Native Client
- [x] clickhouse-rs binding
- [x] TLS Support
- [x] Type Mapping WIP
- [x] Strings
- [x] Integers
- [ ] Everything Else


### โš™๏ธ Dev Build
You can easily modify the code and build a local extension for testing and development.

#### Requirements
- Rust

1) Clone and Compile the extension on your system

```bash
cd /usr/src
git clone --recurse-submodules https://github.com/quackscience/duckdb-extension-clickhouse-native
cd duckdb-extension-clickhouse-native
make configure && make
```

2) Download and Run DuckDB with -unsigned
```
wget https://github.com/duckdb/duckdb/releases/download/v1.1.3/duckdb_cli-linux-amd64.zip && unzip duckdb_cli-linux-amd64.zip
./duckdb -unsigned
```

3) Load your local extension build
```sql
D LOAD '/usr/src/duckdb-extension-clickhouse-native/build/debug/clickhouse_native.duckdb_extension';
```

----

###### Disclaimer
> DuckDB ยฎ is a trademark of DuckDB Foundation.
> ClickHouseยฎ is a trademark of ClickHouse Inc. All trademarks, service marks, and logos mentioned or depicted are the property of their respective owners. The use of any third-party trademarks, brand names, product names, and company names is purely informative or intended as parody and does not imply endorsement, affiliation, or association with the respective owners.