Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/garystafford/kafka-connect-msk-demo
For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR
- Host: GitHub
- URL: https://github.com/garystafford/kafka-connect-msk-demo
- Owner: garystafford
- License: MIT
- Created: 2021-08-10T19:03:41.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-01-02T18:26:00.000Z (about 3 years ago)
- Last Synced: 2024-03-19T01:33:45.034Z (10 months ago)
- Topics: aws, kafka, kafka-connect, kubernetes, spark, spark-streaming
- Language: Python
- Homepage:
- Size: 12.4 MB
- Stars: 61
- Watchers: 4
- Forks: 27
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Exploring Data Lakes CDC, Apache Kafka, and Apache Hudi on AWS
Source code for the following series of blog posts:
* [Hydrating a Data Lake using Query-based CDC with Apache Kafka Connect and Kubernetes on AWS](https://itnext.io/hydrating-a-data-lake-using-query-based-cdc-with-apache-kafka-connect-and-kubernetes-on-aws-cd4725b58c2e)
* [Hydrating a Data Lake using Log-based Change Data Capture (CDC) with Debezium, Apicurio, and Kafka Connect on AWS](https://garystafford.medium.com/hydrating-a-data-lake-using-log-based-change-data-capture-cdc-with-debezium-apicurio-and-kafka-799671e0012f)
* [Getting Started with Spark Structured Streaming and Kafka on AWS using Amazon MSK and Amazon EMR](https://garystafford.medium.com/getting-started-with-spark-structured-streaming-and-kafka-on-aws-using-amazon-msk-and-amazon-emr-91b1f2ef0162)
* [Stream Processing with Apache Spark, Kafka, Avro, and Apicurio Registry on Amazon EMR and Amazon MSK](https://itnext.io/stream-processing-with-apache-spark-kafka-avro-and-apicurio-registry-on-amazon-emr-and-amazon-13080defa3be)
* [Working with Apache Avro files in Amazon S3](https://garystafford.medium.com/previewing-apache-avro-files-in-amazon-s3-98f41e98f656)
* [Building Open Data Lakes: Debezium, Apache Kafka, Hudi, Spark, and Hive on AWS](https://youtu.be/E1N0RuK1PLc)
* [The Art of Building Open Data Lakes with Apache Hudi, Kafka, Hive, and Debezium](https://garystafford.medium.com/the-art-of-building-open-data-lakes-with-apache-hudi-kafka-hive-and-debezium-3d2f71c5981f)

---

The contents of this repository represent my viewpoints and not those of my past or current employers, including Amazon Web
Services (AWS). All third-party libraries, modules, plugins, and SDKs are the property of their respective owners.