https://github.com/sanjuthomas/kafka-connect-gcp-bigtable
Kafka Sink Connect to GCP Bigtable - https://www.confluent.io/hub/sanjuthomas/kafka-connect-gcp-bigtable
- Host: GitHub
- URL: https://github.com/sanjuthomas/kafka-connect-gcp-bigtable
- Owner: sanjuthomas
- License: mit
- Created: 2018-12-28T16:53:45.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-06-30T23:52:01.000Z (10 months ago)
- Last Synced: 2024-07-05T05:21:52.272Z (10 months ago)
- Topics: bigtable, cloudbigtable, gcp, kafka, kafka-connect, kafka-connector, kafka-sink
- Language: Java
- Homepage: http://sanjuthomas.com
- Size: 506 KB
- Stars: 7
- Watchers: 3
- Forks: 8
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Kafka Sink Connect Google Cloud (GCP) Bigtable
This sink-only Kafka Connect connector streams messages from Apache Kafka to Bigtable, the Google Cloud Platform (GCP) wide-column store.
## What is Apache Kafka
Apache Kafka is an open-source stream-processing platform developed by the Apache Software Foundation and written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for real-time data feeds. For more details, see the [Apache Kafka home page](https://kafka.apache.org/).
## What is Google Cloud Bigtable
Bigtable is a compressed, high-performance, proprietary data storage system built on Google File System, Chubby Lock Service, SSTable, and a few other Google technologies. On May 6, 2015, a public version of Bigtable was made available as a service on the Google Cloud Platform. For more details, please refer to the [GCP Bigtable home page](https://cloud.google.com/bigtable/).
## High Level Architecture
This project uses the [bigtable-client-core](https://mvnrepository.com/artifact/com.google.cloud.bigtable/bigtable-client-core) library (no HBase) to stream data to GCP Bigtable. Internally, [bigtable-client-core](https://mvnrepository.com/artifact/com.google.cloud.bigtable/bigtable-client-core) uses the [gRPC](https://grpc.io/) framework to talk to GCP Bigtable.

## Prerequisites
You need [Apache ZooKeeper](https://zookeeper.apache.org) and [Apache Kafka](https://kafka.apache.org) installed and running on your machine. Please refer to the respective sites to download and start ZooKeeper and Kafka. You also need Java 11 or above.
### Tested Software Versions
| Software | Version | Note |
| ------------- |---------------| ------------------------------------- |
| Java | 11 | Tested using Java 11. |
| Kafka | 3.3.1 | Please [refer](https://kafka.apache.org/downloads). Tested using kafka_2.13-3.3.1, should work with older versions. |
| bigtable-client-core | 1.27.1 | Please [refer](https://mvnrepository.com/artifact/com.google.cloud.bigtable/bigtable-client-core/1.27.1). |
| Kafka connect-api | 3.3.1 | Please [refer](https://mvnrepository.com/artifact/org.apache.kafka/connect-api/3.3.1). |
| grpc-netty-shaded | 1.51.0 | Please [refer](https://mvnrepository.com/artifact/io.grpc/grpc-netty-shaded/1.51.0). |

## Configurations
Please refer to the project [Wiki](https://github.com/sanjuthomas/kafka-connect-gcp-bigtable/wiki/Kafka-Connect-GCP-Bigtable-sink-configurations).
### Constraints
The current configuration system supports streaming messages from a given topic to a single table. You can subscribe to any number of topics, but each topic can point to one and only one table. Each subscribed topic needs a yml file named after it: for example, if you subscribe to a topic named demo-topic, you should have a yml file named demo-topic.yml. That yml file contains all the configuration required to transform and write the topic's data into Bigtable.
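
The naming rule and the one-table-per-topic rule above can be illustrated with a minimal Java sketch. It is not the connector's own code: the class `TopicTableRegistry`, its methods, the `configs` directory, and the table name `demo-table` are all hypothetical names chosen for illustration. Only the `<topic>.yml` convention and the one-topic-to-one-table constraint come from this README.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the connector's code base. */
final class TopicTableRegistry {

    // Topic name -> Bigtable table name. Each topic may map to exactly one table.
    private final Map<String, String> tableByTopic = new HashMap<>();

    /** Returns the config file expected for a topic, e.g. demo-topic -> demo-topic.yml. */
    static Path configFileFor(Path configDir, String topic) {
        if (topic == null || topic.isBlank()) {
            throw new IllegalArgumentException("topic must not be blank");
        }
        return configDir.resolve(topic + ".yml");
    }

    /** Registers a topic's table; rejects a second, different table for the same topic. */
    void register(String topic, String table) {
        String existing = tableByTopic.putIfAbsent(topic, table);
        if (existing != null && !existing.equals(table)) {
            throw new IllegalStateException(
                "Topic '" + topic + "' already points to table '" + existing + "'");
        }
    }

    /** Returns the table registered for the topic, or null if none is registered. */
    String tableFor(String topic) {
        return tableByTopic.get(topic);
    }

    public static void main(String[] args) {
        // "configs" and "demo-table" are assumed names, used only for this example.
        System.out.println(configFileFor(Path.of("configs"), "demo-topic"));

        TopicTableRegistry registry = new TopicTableRegistry();
        registry.register("demo-topic", "demo-table");
        System.out.println(registry.tableFor("demo-topic"));
    }
}
```

Several topics can be registered side by side; only a second, conflicting table for an already registered topic is rejected. The actual keys inside each yml file are documented in the project Wiki and are not modelled here.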
## How to build the artifact
Please refer to the project [Wiki](https://github.com/sanjuthomas/kafka-connect-gcp-bigtable/wiki/How-to-build-the-Kafka-Connect-GCP-Bigtable%3F).
## How to deploy the connector
Please refer to the project [Wiki](https://github.com/sanjuthomas/kafka-connect-gcp-bigtable/wiki/How-to-deploy-the-Kafka-Connect-GCP-Bigtable-and-verify-the-deployment%3F).
## How to start the connector in stand-alone mode
Please refer to the project [Wiki](https://github.com/sanjuthomas/kafka-connect-gcp-bigtable/wiki/How-to-start-the-Kafka-Sink-Connect-GCP-Bigtable%3F).
## Questions
Either create an issue in this project or send your question to [email protected]. Thanks!
## License
[FOSSA license report](https://app.fossa.io/projects/git%2Bgithub.com%2Fsanjuthomas%2Fkafka-connect-gcp-bigtable?ref=badge_large)