{"id":35867857,"url":"https://github.com/databendcloud/bend-ingest-kafka","last_synced_at":"2026-01-20T10:06:20.418Z","repository":{"id":109889309,"uuid":"609010258","full_name":"databendcloud/bend-ingest-kafka","owner":"databendcloud","description":"Ingest kafka data into databend ","archived":false,"fork":false,"pushed_at":"2024-05-15T07:04:51.000Z","size":54,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-05-16T12:52:53.888Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/databendcloud.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-03T07:23:31.000Z","updated_at":"2024-05-30T06:15:36.405Z","dependencies_parsed_at":"2024-05-30T06:15:22.806Z","dependency_job_id":null,"html_url":"https://github.com/databendcloud/bend-ingest-kafka","commit_stats":null,"previous_names":[],"tags_count":27,"template":false,"template_full_name":null,"purl":"pkg:github/databendcloud/bend-ingest-kafka","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendcloud%2Fbend-ingest-kafka","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendcloud%2Fbend-ingest-kafka/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendcloud%2Fbend-ingest-kafka/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendcloud%2Fbend-ingest-kafka/manifests","owner_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/owners/databendcloud","download_url":"https://codeload.github.com/databendcloud/bend-ingest-kafka/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendcloud%2Fbend-ingest-kafka/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28601320,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-20T09:39:28.479Z","status":"ssl_error","status_checked_at":"2026-01-20T09:38:10.511Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-08T14:13:55.344Z","updated_at":"2026-01-20T10:06:20.411Z","avatar_url":"https://github.com/databendcloud.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# bend-ingest-kafka\n\nIngest Kafka data into Databend.\n\n# Installation\n\n```shell\ngo install github.com/databendcloud/bend-ingest-kafka@latest\n```\n\nOr download the binary from the [release page](https://github.com/databendcloud/bend-ingest-kafka/releases).\n\n```shell\nbend-ingest-kafka --version\n```\n\n# Usage\n\n## Json transform mode\n\nJSON transform mode is the default mode. It transforms the Kafka data into rows of a Databend table. You can enable it explicitly by setting `--is-json-transform` to `true`.\n### Create a table according to your kafka data 
structure\nFor example, if the Kafka data looks like this:\n\n```json\n{\"i64\": 10,\"u64\": 30,\"f64\": 20,\"s\": \"hao\",\"s2\": \"hello\",\"a16\":[1],\"a8\":[2],\"d\": \"2011-03-06\",\"t\": \"2016-04-04 11:30:00\"}\n```\n\ncreate the table with:\n\n```sql\nCREATE TABLE test_ingest (\n    i64 Int64,\n    u64 UInt64,\n    f64 Float64,\n    s   String,\n    s2  String,\n    a16 Array(Int16),\n    a8  Array(UInt8),\n    d   Date,\n    t   DateTime);\n```\n\n### Execute bend-ingest-kafka\n\n#### Command line mode\n```shell\nbend-ingest-kafka \\\n  --kafka-bootstrap-servers=\"127.0.0.1:9092,127.0.0.2:9092\" \\\n  --kafka-topic=\"Your Topic\" \\\n  --kafka-consumer-group=\"Consumer Group\" \\\n  --databend-dsn=\"databend://user:password@localhost:8000/default?sslmode=disable\" \\\n  --databend-table=\"db1.tbl\" \\\n  --data-format=\"json\" \\\n  --batch-size=100000 \\\n  --batch-max-interval=300\n```\n\n#### Config file mode\nEdit the config file `config/conf.json`:\n```json\n{\n  \"kafkaBootstrapServers\": \"localhost:9092\",\n  \"kafkaTopic\": \"ingest_test\",\n  \"KafkaConsumerGroup\": \"test\",\n  \"isJsonTransform\": true,\n  \"databendDSN\": \"databend://user:password@localhost:8000/default?sslmode=disable\",\n  \"databendTable\": \"default.kfk_test\",\n  \"batchSize\": 1,\n  \"batchMaxInterval\": 5,\n  \"dataFormat\": \"json\",\n  \"workers\": 1\n}\n```\n\nand execute the command:\n```shell\n./bend-ingest-kafka\n```\n\n## Raw mode\nRaw mode ingests the raw Kafka data into a Databend table. You can use it by setting `isJsonTransform` to `false`.\nIn this mode, the tool creates a table named by `databendTable` with the columns `(uuid, koffset, kpartition, raw_data, record_metadata, add_time)` and ingests the raw data into it.\n`record_metadata` is the metadata of the Kafka record, which contains the `topic`, `partition`, `offset`, `create_time`, and `key`; `add_time` is the time when the record is added into 
databend.\n\n### Example\nIf the kafka json data is:\n```json\n{\"i64\": 10,\"u64\": 30,\"f64\": 20,\"s\": \"hao\",\"s2\": \"hello\",\"a16\":[1],\"a8\":[2],\"d\": \"2011-03-06\",\"t\": \"2016-04-04 11:30:00\"}\n```\nrun the command\n```shell\n./bend-ingest-kafka \n```\n\nwith `config/conf.json` and the table `default.kfk_test` will be created and the data will be ingested into this table.\n\n![](https://files.mdnice.com/user/4760/2e8b0267-5694-43b5-9992-316280b4594f.png)\n\n\n## Parameter References\n| Parameter             | Description               | Default           | example                         |\n|-----------------------|---------------------------|-------------------|---------------------------------|\n| kafkaBootstrapServers | kafka bootstrap servers   | \"127.0.0.1:64103\" | \"127.0.0.1:9092,127.0.0.2:9092\" |\n| kafkaTopic            | kafka topic               | \"test\"            | \"test\"                          |\n| KafkaConsumerGroup    | kafka consumer group      | \"kafka-bend-ingest\" | \"test\"                          |\n|isSASL                 | is sasl                   | false             | true                            |\n|saslUser               | sasl user                 | \"\"                | \"user\"                          |\n|saslPassword           | sasl password             | \"\"                | \"password\"                      |\n| mockData              | mock data                 | \"\"                | \"\"                              |\n| isJsonTransform       | is json transform         | true              | true                            |\n| databendDSN           | databend dsn              | no                | \"databend://user:password@localhost:8000/default?sslmode=disable\"         |\n| databendTable         | databend table            | no                | \"db1.tbl\"                       |\n| batchSize             | batch size                | 1000              | 1000                            |\n| 
batchMaxInterval      | batch max interval (seconds)       | 30                  | 30                              |\n| dataFormat            | data format               | json              | \"json\"                          |\n| workers               | number of worker threads  | 1                 | 1                               |\n| copyPurge             | copy purge                | false             | false                           |\n| copyForce             | copy force                | false             | false                           |\n| DisableVariantCheck   | disable variant check     | false             | false                           |\n| MinBytes              | min bytes                 | 1024              | 1024                            |\n| MaxBytes              | max bytes                 | 1048576           | 1048576                         |\n| MaxWait               | max wait time (seconds)   | 10                | 10                              |\n| useReplaceMode       | use replace mode          | false             | false                           |\n| userStage             | user external stage name  | ~                 | ~                               |\n| maxRetryDelay         | max retry delay (seconds) | 1800              | 1800                            |\n\n**NOTE:**\n- `copyPurge` and `copyForce` set the `PURGE` and `FORCE` options of Databend's `COPY INTO`: `PURGE` removes the staged files after they are loaded successfully, and `FORCE` loads files even if they have already been loaded. Neither option deletes data from the target table. For more details, refer to [copy](https://docs.databend.com/sql/sql-commands/dml/dml-copy-into-table#copy-options).\n- `useReplaceMode` replaces data in the table: if the data already exists in the table, the new data replaces the old data. 
However, `useReplaceMode` is only supported when `isJsonTransform` is `false`, because it needs the `koffset` and `kpartition` fields in the target table.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabendcloud%2Fbend-ingest-kafka","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatabendcloud%2Fbend-ingest-kafka","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabendcloud%2Fbend-ingest-kafka/lists"}