An open API service indexing awesome lists of open source software.

https://github.com/gamussa/flink-for-java-workshop

Apache Flink for Java Developers Workshop
https://github.com/gamussa/flink-for-java-workshop

confluent confluent-cloud flink-table

Last synced: 3 months ago
JSON representation

Apache Flink for Java Developers Workshop

Awesome Lists containing this project

README

          

= Apache Flink for Java Developers Workshop
Viktor Gamov , Sandon Jackobs
v1.0, 2025-02-17
:toc:

image:https://github.com/gAmUssA/flink-for-java-workshop/actions/workflows/smoke-test.yml/badge.svg[Smoke Test,link=https://github.com/gAmUssA/flink-for-java-workshop/actions/workflows/smoke-test.yml]
image:https://github.com/gAmUssA/flink-for-java-workshop/actions/workflows/ci.yml/badge.svg[CI,link=https://github.com/gAmUssA/flink-for-java-workshop/actions/workflows/ci.yml]

== 📚 Workshop Description & Learning Objectives

You were tasked with building a real-time platform, but traditional tools couldn’t handle the massive data streams, leading to lagging performance and frustrated users.
Deadlines were looming, and the team needed a breakthrough.

And when you learned about Apache Flink.

We will explore how the *DataStream API* allowed efficient real-time data processing and how the *Table API* and *SQL* features simplified complex queries with familiar syntax.
Testing became more straightforward, and managing the application state was no longer a headache.

You’ll learn:

- Harnessing the DataStream API: Process unbounded data streams efficiently to make your applications more responsive.
- Unlocking Table API and SQL: Use SQL queries within Flink to simplify data processing tasks without learning new languages.
- Effective Testing Strategies: Implement best practices for testing Flink applications to ensure your code is robust and reliable.
- Stateful Stream Processing: Manage application state effectively for complex event processing and real-time analytics.

By the end of this talk, you will be equipped to tackle real-time data challenges.
Whether you're building analytics dashboards, event-driven systems, or handling data streams, Apache Flink can be the game-changer you’ve been searching for.

== 💻 Technical Prerequisites

. *Basic Programming Knowledge* – Familiarity with **Java** or **Scala** (ugh) (Flink supports both but not for long).
. *Understanding of Stream Processing Concepts* – Awareness of real-time data pipelines, event-driven architectures, and streaming frameworks.
. *SQL Proficiency* – Basic understanding of SQL for working with Flink’s **Table API**.
. *Linux/macOS Command Line Experience* – Ability to execute terminal commands and navigate a Unix-like environment.

== 🔧 Required Software and Setup

=== 1️⃣ Docker & Docker Compose (Mandatory)

- *Why?* Flink and Kafka components will be containerized.
- *Download & Install:* https://www.docker.com/get-started[Docker Website]
- *macOS Alternative:* https://orbstack.dev/[OrbStack] (recommended for better performance)

=== 2️⃣ Confluent Cloud Account (Mandatory)

- *Why?* Required for Kafka-based streaming exercises.
- *Sign Up & Get API Keys:* https://www.confluent.io/confluent-cloud/[Confluent Cloud]

=== 3️⃣ Java 21 Installed

- *Why?* Apache Flink will run on Java 21.
- *Download Java 21 via SDKMAN:* https://sdkman.io[SDKMAN]
- *Verify Installation:* Run `java -version` in the terminal.

=== 4️⃣ Git Installed

- *Why?* For cloning repositories and working with project files.
- *Download Git:* https://git-scm.com/downloads[Git Website]

=== 5️⃣ IDE with Gradle Support

- *Why?* Recommended for Flink development.
- *Download IntelliJ IDEA (Recommended):* https://www.jetbrains.com/idea/download/[IntelliJ IDEA]
- *Download VS Code (Alternative):* https://code.visualstudio.com/download[VS Code]
- *Confluent Extension for VSCode:* https://marketplace.visualstudio.com/items?itemName=confluentinc.vscode-confluent[https://marketplace.visualstudio.com/items?itemName=confluentinc.confluent-vscode][link]

=== 6️⃣ Gradle Installed via Wrapper (No Need for Local Installation)

- *Why?* The workshop will use the **Gradle Wrapper**, eliminating the need for manual installation.
- *Gradle Docs:* https://docs.gradle.org/current/userguide/gradle_wrapper.html[Gradle Wrapper Documentation]

=== 7️⃣ Quick Setup for macOS Users with Homebrew

- *Why?* Simplifies the installation of all required dependencies.
- *How?* This project includes a `Brewfile` for managing dependencies.
- *Setup:* Run `make setup-mac` to install all required dependencies using Homebrew.
- *Update:* Run `make update-brew-deps` to update dependencies.

== 🌐 Network & System Requirements

. *Stable Internet Connection* – Required for downloading dependencies and connecting to Confluent Cloud.
. *8GB+ RAM Recommended* – Running Flink, and other services may require significant memory.
. *Sufficient Disk Space (At Least 10GB Free)* – For Docker images, logs, and data processing.

== ⚙️ Optional but Recommended

=== 1️⃣ Terraform Installed

- *Why?* Useful for automated infrastructure setup in Confluent Cloud.
- *Download Terraform:* https://developer.hashicorp.com/terraform/downloads[Terraform Website]

=== 2️⃣ Basic Understanding of Terraform and IaC (Infrastructure as Code)

- *Why?* If Terraform scripts are used, a fundamental knowledge of how it works would be beneficial.
- *Terraform Getting Started Guide:* https://developer.hashicorp.com/terraform/tutorials[Terraform Tutorials]

=== 3️⃣ Confluent CLI

- *Why?* The workshop will use the commands in the Confluent CLI to get useful information about new Confluent infrastructure.
- *Download and Install:* https://docs.confluent.io/confluent-cli/current/install.html[Confluent CLI Installation Instructions]

=== 4️⃣ jq

- *Why?* The workshop will use jq to build configuration files used to demonstrate the Confluent Flink Table API.
- *Download and Install:* https://jqlang.org/download/[jq Download Instructions]

== 📌 Pre-Workshop Setup Tasks

. *Sign up for Confluent Cloud & Configure API Keys* – Ensure access credentials are available before the workshop.
. *Clone the Workshop Repository* – The repo will include pre-built examples and configuration files (GitHub link will be shared before the workshop).
. *Set Up Environment Variables* – Configure `JAVA_HOME` and authentication variables for Confluent Cloud.
. *Run a Simple Docker-Based Flink Job* – Validate that the environment is correctly configured.