https://github.com/databendcloud/flink-connector-databend

Flink SQL connector for Databend.
databend flink flinkcdc


Host: GitHub
URL: https://github.com/databendcloud/flink-connector-databend
Owner: databendcloud
License: apache-2.0
Created: 2023-02-18T02:19:58.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2025-08-22T14:06:53.000Z (10 months ago)
Last Synced: 2025-08-22T14:21:10.200Z (10 months ago)
Topics: databend, flink, flinkcdc
Language: Java
Homepage:
Size: 157 KB
Stars: 4
Watchers: 5
Forks: 1
Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE

# Flink Databend Connector

[Flink](https://github.com/apache/flink) SQL connector for the [Databend](https://github.com/datafuselabs/databend) database. This project is powered by [Databend JDBC](https://github.com/databendcloud/databend-jdbc).

Currently, the project supports `Sink Table`.  

Please create an issue if you encounter a bug. Any help with the project is greatly appreciated.

## Connector Options

| Option               | Required | Default | Type     | Description                                                                                                                    |
|:---------------------|:---------|:--------|:---------|:-------------------------------------------------------------------------------------------------------------------------------|
| url                  | required | none    | String   | The Databend JDBC URL, in the format `databend://{ip}:{port}`.                                                                 |
| username             | optional | none    | String   | The Databend username. `username` and `password` must both be specified if either one is specified.                            |
| password             | optional | none    | String   | The Databend password.                                                                                                         |
| database-name        | optional | default | String   | The Databend database name.                                                                                                    |
| table-name           | required | none    | String   | The Databend table name.                                                                                                       |
| sink.batch-size      | optional | 1000    | Integer  | The maximum batch size; buffered data is flushed once this size is exceeded.                                                   |
| sink.flush-interval  | optional | 1s      | Duration | The flush interval; once it elapses, asynchronous threads flush the buffered data.                                             |
| sink.max-retries     | optional | 3       | Integer  | The maximum number of retries when writing records to the database fails.                                                      |
| sink.update-strategy | optional | update  | String   | How to handle a record of type UPDATE_AFTER: convert it to an update or insert statement, or discard it. Available: update, insert, discard. |
| sink.ignore-delete   | optional | true    | String   | Whether to ignore `DELETE` events.                                                                                             |
| sink.primary-key     | optional | "id"    | String   | The primary key used for upserts.                                                                                              |

**NOTE**: `sink.ignore-delete` defaults to `true`, so the connector does not handle `DELETE` events by default. If you set `sink.ignore-delete=false`, make sure the source database has a primary key of type `Integer` or `String`.
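
To make the interplay of `sink.update-strategy` and `sink.ignore-delete` concrete, here is a minimal Java sketch of how a change event could be routed to a write action. It is an illustration of the option semantics described above, not the connector's actual code: `ChangeRouting`, `ChangeKind`, `Action`, and `route` are hypothetical names, and dropping `UPDATE_BEFORE` events is an assumption.

```java
import java.util.Optional;

// Hypothetical illustration of the option semantics; not the connector's real classes.
final class ChangeRouting {
    // Kinds of change events a CDC source can emit.
    enum ChangeKind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // Write actions the sink could issue against Databend.
    enum Action { INSERT, UPDATE, DELETE }

    // Returns the action to perform, or Optional.empty() if the event is dropped.
    static Optional<Action> route(ChangeKind kind, String updateStrategy, boolean ignoreDelete) {
        switch (kind) {
            case INSERT:
                return Optional.of(Action.INSERT);
            case UPDATE_AFTER:
                // sink.update-strategy: update, insert, or discard
                switch (updateStrategy) {
                    case "update":
                        return Optional.of(Action.UPDATE);
                    case "insert":
                        return Optional.of(Action.INSERT);
                    case "discard":
                        return Optional.empty();
                    default:
                        throw new IllegalArgumentException(
                                "Unknown sink.update-strategy: " + updateStrategy);
                }
            case DELETE:
                // sink.ignore-delete = true drops DELETE events
                return ignoreDelete ? Optional.empty() : Optional.of(Action.DELETE);
            default:
                // UPDATE_BEFORE: assumed to be dropped in this sketch
                return Optional.empty();
        }
    }
}
```

With the defaults (`update`, `true`), `route(ChangeKind.UPDATE_AFTER, "update", true)` yields `Action.UPDATE`, while a `DELETE` event yields an empty result and is skipped.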

**Upsert Data Considerations:**

## Data Type Mapping

| Flink Type          | Databend Type                                          |
|:--------------------|:-------------------------------------------------------|
| CHAR                | String                                                 |
| VARCHAR             | String                                                 |
| STRING              | String                                                 |
| BOOLEAN             | Boolean                                                |
| BYTES               | String                                                 |
| DECIMAL             | Decimal / Int128 / Int256 / UInt64 / UInt128 / UInt256 |
| TINYINT             | Int8                                                   |
| SMALLINT            | Int16 / UInt8                                          |
| INTEGER             | Int32 / UInt16 / Interval                              |
| BIGINT              | Int64 / UInt32                                         |
| FLOAT               | Float                                                  |
| DOUBLE              | Double                                                 |
| DATE                | Date                                                   |
| TIME                | DateTime                                               |
| TIMESTAMP           | DateTime                                               |
| TIMESTAMP_LTZ       | DateTime                                               |
| INTERVAL_YEAR_MONTH | Int32                                                  |
| INTERVAL_DAY_TIME   | Int64                                                  |
| ARRAY               | Array                                                  |
| MAP                 | Map                                                    |
| ROW                 | Not supported                                          |
| MULTISET            | Not supported                                          |
| RAW                 | Not supported                                          |
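
For quick checks in application code, the mapping can be encoded as a small lookup table. The Java sketch below is only a transcription of the table in this section; `TypeMappingTable`, `databendTypesFor`, and `isSupported` are hypothetical names and are not part of the connector's API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: a plain transcription of the type mapping table.
final class TypeMappingTable {
    private static final Map<String, List<String>> TABLE = new HashMap<>();

    // Registers a Flink type with the Databend types listed for it (none = not supported).
    private static void put(String flinkType, String... databendTypes) {
        TABLE.put(flinkType, Collections.unmodifiableList(Arrays.asList(databendTypes)));
    }

    static {
        put("CHAR", "String");
        put("VARCHAR", "String");
        put("STRING", "String");
        put("BOOLEAN", "Boolean");
        put("BYTES", "String");
        put("DECIMAL", "Decimal", "Int128", "Int256", "UInt64", "UInt128", "UInt256");
        put("TINYINT", "Int8");
        put("SMALLINT", "Int16", "UInt8");
        put("INTEGER", "Int32", "UInt16", "Interval");
        put("BIGINT", "Int64", "UInt32");
        put("FLOAT", "Float");
        put("DOUBLE", "Double");
        put("DATE", "Date");
        put("TIME", "DateTime");
        put("TIMESTAMP", "DateTime");
        put("TIMESTAMP_LTZ", "DateTime");
        put("INTERVAL_YEAR_MONTH", "Int32");
        put("INTERVAL_DAY_TIME", "Int64");
        put("ARRAY", "Array");
        put("MAP", "Map");
        put("ROW");      // not supported
        put("MULTISET"); // not supported
        put("RAW");      // not supported
    }

    // Returns the Databend types listed for a Flink type name (case-insensitive).
    static List<String> databendTypesFor(String flinkType) {
        List<String> types = TABLE.get(flinkType.trim().toUpperCase(Locale.ROOT));
        if (types == null) {
            throw new IllegalArgumentException("Unknown Flink type: " + flinkType);
        }
        return types;
    }

    // A Flink type is supported if the table lists at least one Databend type for it.
    static boolean isSupported(String flinkType) {
        return !databendTypesFor(flinkType).isEmpty();
    }
}
```

For example, `TypeMappingTable.isSupported("ROW")` is `false`, which lets a job fail fast before a table with an unsupported column type is registered.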

## Maven Dependency

The project isn't published to the Maven Central repository, so you need to install or deploy it to your own repository before using it. The steps are as follows:

```bash
# clone the project
git clone https://github.com/databendcloud/flink-connector-databend.git

# enter the project directory
cd flink-connector-databend/

# display remote branches
git branch -r

# checkout the branch you need
git checkout $branch_name

# install or deploy the project to our own repository
mvn clean install -DskipTests
mvn clean deploy -DskipTests
```

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-databend</artifactId>
    <version>1.0.0-SNAPSHOT</version>
</dependency>
```

## How to use

### Create and read/write table

```SQL
-- register a Databend table `t_user` in Flink SQL.
CREATE TABLE t_user
(
    `user_id`   BIGINT,
    `user_type` INTEGER,
    `language`  STRING,
    `country`   STRING,
    `gender`    STRING,
    `score`     DOUBLE,
    `list`      ARRAY<STRING>,
    `map`       MAP<STRING, BIGINT>,
    -- Flink only accepts primary keys declared as NOT ENFORCED
    PRIMARY KEY (`user_id`) NOT ENFORCED
) WITH (
      'connector' = 'databend',
      'url' = 'databend://{ip}:{port}',
      'database-name' = 'default',
      'table-name' = 'users',
      'sink.batch-size' = '500',
      'sink.flush-interval' = '1000',
      'sink.ignore-delete' = 'false',
      'sink.max-retries' = '3'
      );

-- read data from Databend
SELECT user_id, user_type
FROM t_user;

-- write data into the Databend table from the table `T`
INSERT INTO t_user
SELECT CAST(`user_id` AS BIGINT),
       `user_type`,
       `lang`,
       `country`,
       `gender`,
       `score`,
       ARRAY['CODER', 'SPORTSMAN'],
       CAST(MAP['BABA', CAST(10 AS BIGINT), 'NIO', CAST(8 AS BIGINT)] AS MAP<STRING, BIGINT>)
FROM T;
```
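
The table definition above sets `sink.batch-size` to `500` and `sink.flush-interval` to `1000`. As a rough illustration of how a size-or-time flush policy of this kind generally behaves, here is a minimal Java sketch. It is not the connector's implementation: `BatchBuffer` and its methods are hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow and test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a buffer that flushes on size or on elapsed time.
final class BatchBuffer<T> {
    private final int batchSize;               // plays the role of sink.batch-size
    private final long flushIntervalMillis;    // plays the role of sink.flush-interval
    private final Consumer<List<T>> flusher;   // receives each flushed batch
    private final List<T> buffer = new ArrayList<>();
    private long lastFlushMillis;

    BatchBuffer(int batchSize, long flushIntervalMillis, long nowMillis,
                Consumer<List<T>> flusher) {
        this.batchSize = batchSize;
        this.flushIntervalMillis = flushIntervalMillis;
        this.lastFlushMillis = nowMillis;
        this.flusher = flusher;
    }

    // Buffers a record and flushes as soon as the batch size is reached.
    void add(T record, long nowMillis) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush(nowMillis);
        }
    }

    // Meant to be called periodically; flushes pending records once the interval has elapsed.
    void onTimer(long nowMillis) {
        if (!buffer.isEmpty() && nowMillis - lastFlushMillis >= flushIntervalMillis) {
            flush(nowMillis);
        }
    }

    // Hands a copy of the buffered records to the flusher and resets the timer.
    void flush(long nowMillis) {
        if (!buffer.isEmpty()) {
            flusher.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
        lastFlushMillis = nowMillis;
    }
}
```

In this sketch a small batch size lowers latency at the cost of more write requests, while the interval bounds how long a partially filled batch can wait.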

### Create and use DatabendCatalog

#### Scala

```scala
// `setting` is an EnvironmentSettings instance created elsewhere.
val tEnv = TableEnvironment.create(setting)

// Connection and sink options for the catalog.
val props = new util.HashMap[String, String]()
props.put(DatabendConfig.DATABASE_NAME, "default")
props.put(DatabendConfig.URL, "databend://127.0.0.1:8000")
props.put(DatabendConfig.USERNAME, "username")
props.put(DatabendConfig.PASSWORD, "password")
props.put(DatabendConfig.SINK_FLUSH_INTERVAL, "30s")

// Create the catalog and register it under the name "databend".
val databendCatalog = new DatabendCatalog("databend", props)
tEnv.registerCatalog("databend", databendCatalog)
tEnv.useCatalog("databend")

tEnv.executeSql("insert into `databend`.`default`.`t_table` select...")
```

#### Java

```java
// `setting` is an EnvironmentSettings instance created elsewhere.
TableEnvironment tEnv = TableEnvironment.create(setting);

// Connection and sink options for the catalog.
Map<String, String> props = new HashMap<>();
props.put(DatabendConfig.DATABASE_NAME, "default");
props.put(DatabendConfig.URL, "databend://127.0.0.1:8000");
props.put(DatabendConfig.USERNAME, "username");
props.put(DatabendConfig.PASSWORD, "password");
props.put(DatabendConfig.SINK_FLUSH_INTERVAL, "30s");

// Create the catalog and register it under the name "databend".
Catalog databendCatalog = new DatabendCatalog("databend", props);
tEnv.registerCatalog("databend", databendCatalog);
tEnv.useCatalog("databend");

tEnv.executeSql("insert into `databend`.`default`.`t_table` select...");
```

## Roadmap

- [x] Implement the Flink SQL Sink function.

- [x] Support DatabendCatalog.

- [x] Implement the Flink SQL Source function.
