An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with apache-flink

A curated list of projects in awesome lists tagged with apache-flink.

https://github.com/ververica/flink-sql-cookbook

The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.

apache-flink flink flink-sql sql stream-processing

Last synced: 18 Feb 2026

https://github.com/GoogleCloudPlatform/flink-on-k8s-operator

[DEPRECATED] Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.

apache-beam apache-flink flink-operator google-cloud-dataproc kubernetes kubernetes-operator operator

Last synced: 23 Mar 2025

https://github.com/googlecloudplatform/flink-on-k8s-operator

[DEPRECATED] Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.

apache-beam apache-flink flink-operator google-cloud-dataproc kubernetes kubernetes-operator operator

Last synced: 03 Oct 2025

https://github.com/alibaba/feathub

FeatHub - A stream-batch unified feature store for real-time machine learning

apache-flink data data-engineering data-quality data-science feature-engineering feature-store machine-learning mlops streaming

Last synced: 14 Oct 2025

https://github.com/raystack/dagger

Dagger is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data.

apache-flink apache-kafka dataops framework influxdb prometheus real-time-analytics real-time-processing stream-processing

Last synced: 06 Apr 2025

https://github.com/knaufk/flink-faker

A data generator source connector for Flink SQL based on data-faker.

apache-flink flink flink-sql

Last synced: 04 Jan 2026

https://github.com/spotify/flink-on-k8s-operator

Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.

apache-beam apache-flink flink flink-operator kubernetes kubernetes-operator

Last synced: 15 May 2025

https://github.com/Chabane/bigdata-playground

A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL

angular apache-flink apache-spark avro big-data docker graphql hadoop hbase kafka kops machine-learning mongodb nodejs parquet python scala spark-sql spark-streaming twitter-api

Last synced: 28 Apr 2025

https://github.com/chabane/bigdata-playground

A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL

angular apache-flink apache-spark avro big-data docker graphql hadoop hbase kafka kops machine-learning mongodb nodejs parquet python scala spark-sql spark-streaming twitter-api

Last synced: 13 Apr 2025

https://github.com/ing-bank/flink-deployer

A tool that helps automate deployment to an Apache Flink cluster

apache-flink deployment docker flink go golang

Last synced: 14 Apr 2025

https://github.com/getindata/dbt-flink-adapter

Adapter for dbt that executes dbt pipelines on Apache Flink

apache-flink data-streaming dbt streaming-analytics

Last synced: 07 May 2025

https://github.com/seznam/euphoria

Euphoria is an open source Java API for creating unified big-data processing flows. It provides an engine independent programming model which can express both batch and stream transformations.

apache-flink apache-spark batch-processing big-data hadoop hdfs java-api kafka streaming-data unified-bigdata-processing

Last synced: 21 Aug 2025

https://github.com/twalthr/flink-api-examples

Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.

apache-flink data-engineering flink flink-examples flink-sql stream-processing

Last synced: 24 Jun 2026

https://github.com/airscholar/flinkcommerce

This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessary infrastructure components, including Apache Flink, Elasticsearch, and Postgres

apache-flink big-data big-data-processing python realtime-streaming

Last synced: 08 Oct 2025

https://github.com/ververica/jupyter-vvp

Jupyter Integration for Flink SQL via Ververica Platform

apache-flink flink jupyter

Last synced: 03 May 2025

https://github.com/raycad/stream-processing

Stream processing guidelines and examples using Apache Flink and Apache Spark

apache-flink apache-spark batch-processing data-analysis streaming

Last synced: 13 Jul 2025

https://github.com/build-on-aws/prioritizing-event-processing-with-apache-kafka

Technical solution to implement event processing prioritization with Apache Kafka using the concept of buckets.

amazon-msk amazon-msk-connect apache-flink apache-kafka event-processing kafka-connect kafka-streams

Last synced: 06 May 2025

https://github.com/garystafford/streaming-sales-generator

Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python

analytics apache-flink apache-kafka data kafka kafka-streams kstreams python spark-structured-streaming streaming-data

Last synced: 03 Aug 2025

https://github.com/sanderploegsma/flink-k8s

Example Apache Flink cluster on Kubernetes

apache-flink kubernetes

Last synced: 29 Apr 2025

https://github.com/brakmic/twitterflink

A simple Twitter-Streaming Application for Apache Flink

apache-flink bigdata jvm scala

Last synced: 12 Jun 2025

https://github.com/cyberdelia/flink-kotlin

Kotlin support for Apache Flink

apache-flink flink kotlin serialization

Last synced: 18 Jan 2026

https://github.com/build-on-aws/real-time-gaming-leaderboard-apache-flink

Example gaming leaderboard application covering streaming ingestion, CDC enrichment, processing, and visualisation, including demos of advanced real-time analytics concepts such as late data arrival, exactly-once, dynamic config, archival, and on-demand replay

amazon-managed-flink apache-flink change-data-capture dynamic-config exactly-once real-time-analytics replay

Last synced: 12 Mar 2026

https://github.com/confluentinc/flink-table-api-python-examples

Python Examples for running Apache Flink® Table API on Confluent Cloud

apache-flink confluent confluent-cloud flink-sql stream-processing table-api

Last synced: 24 Jun 2026

https://github.com/fabricalab/streaming-flink-file-source

File source operator for Apache Flink

apache-flink flink streaming

Last synced: 14 Jan 2026

https://github.com/ishaanadarsh/medstream-analytics

The project leverages Apache Flink, Apache Kafka, and a Python digital twin to provide real-time insights into healthcare data, enabling timely interventions and proactive patient care.

analytics apache-flink apache-kafka complex-event-processing digital-twin stream-processing streaming

Last synced: 11 Mar 2026

https://github.com/decodableco/dbt-decodable

A dbt adapter for Decodable

apache-flink dbt flink sql stream-processing

Last synced: 09 Apr 2026

https://github.com/airscholar/apacheflink-salesanalytics

This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project demonstrates how to ingest, process, and analyze sales data, showcasing the capabilities of Apache Flink for big data processing.

apache-flink data-engineering end-to-end-data-engineering sales-analytics

Last synced: 10 Apr 2025

https://github.com/fiware/tutorials.big-data-flink

FIWARE 305: Real-time Processing of Context Data using Apache Flink

apache-flink big-data-analytics fiware fiware-cosmos flink orion-flink-connector tutorial

Last synced: 30 Apr 2025

https://github.com/isopropylcyanide/flink-couchbase-data-sink

A Flink job that reads a JSON file (either one-time or continuous poll) as its source and dumps it to Couchbase as a sink using the asynchronous Couchbase SDK.

apache-flink couchbase couchbase-server couchbase-sink database-store flink flink-examples flink-job

Last synced: 03 May 2025

https://github.com/gordonmurray/apache_flink_and_iceberg

Using Apache Flink to write to S3 in Apache Iceberg format

apache-flink apache-iceberg parquet s3

Last synced: 12 Apr 2025

https://github.com/gordonmurray/apache_flink_and_docker_compose

Running Apache Flink containers using Docker Compose is a convenient way to get up and running to try out some Flink workloads.

apache-flink cdc mariadb redis

Last synced: 30 Jul 2025

https://github.com/gordonmurray/apache_flink_and_paimon

Trying out Apache Paimon with Apache Flink using Docker Compose

apache-flink orc paimon s3

Last synced: 24 Aug 2025

https://github.com/gamussa/uncorking-analytics-with-pinot-kafka-flink

In this full-day training, we will explore the architectures of Apache Kafka, Apache Flink, and Apache Pinot. We will run local clusters of each system, studying the role each plays in a real-time analytics pipeline.

apache-flink apache-pinot kafka

Last synced: 14 Apr 2025

https://github.com/garystafford/flink-kafka-demo

Apache Flink/Apache Kafka streaming data analytics demonstration using Streaming Synthetic Sales Data Generator

analytics apache-flink apache-kafka flink kafka streaming-data

Last synced: 10 Oct 2025

https://github.com/gordonmurray/apache_flink_paimon_and_seatunnel

Trying out Apache Flink with Apache Paimon and Apache SeaTunnel

apache-flink paimon seatunnel

Last synced: 12 Apr 2025

https://github.com/archisman-mridha/instagram-clone

Demonstrating fault tolerant distributed systems by building a battle tested Instagram Clone | For educational purposes only

apache-flink apache-kafka aws cloudnative debezium distributed-systems gitops golang graphql grpc kcl kubernetes nextjs nix observability rust supplychain-security

Last synced: 04 Sep 2025

https://github.com/twalthr/flink-project-template

Template for an Apache Flink project.

apache-flink flink flink-examples

Last synced: 24 Jun 2026

https://github.com/ren294/smarttraffic_lakehouse_for_hcmc

A Smart Traffic Management System for Ho Chi Minh City, Vietnam leveraging batch and real-time data processing, intuitive dashboards, and monitoring tools to optimize traffic flow, enhance safety, and support sustainable urban mobility through advanced analytics and user-friendly applications.

apache-airflow apache-flink apache-hive apache-hudi apache-kafka apache-nifi apache-spark apache-superset apache-zookeeper big-data debezium grafana lakefs metabase minio promotheus redis seatunnel streamlit trino

Last synced: 11 Apr 2025

https://github.com/vojay-dev/flitch

Let's learn about Apache Flink and sentiment analysis by building a real-time sentiment analysis streaming application for the Twitch chat.

apache-flink flink nlp sentiment-analysis stream-processing twitch

Last synced: 20 Sep 2025

https://github.com/emptyless/flinksql-mlflow

A flinksql-mlflow-pytorch implementation

apache-flink apache-kafka mlflow

Last synced: 09 Oct 2025

https://github.com/ev2900/flink_late_arriving_date_event_order

Helps explain how Flink handles late arriving data and the effects on message order

apache-flink flink flink-sql

Last synced: 06 Apr 2025

https://github.com/guru107/flinkjobuploadplugin

A Maven plugin that submits the created fat JAR to the Apache Flink cluster

apache-flink java maven-plugin

Last synced: 23 Apr 2025

https://github.com/idealista/flink_role

Ansible role to install Apache Flink

ansible-role apache-flink flink

Last synced: 28 Apr 2025

https://github.com/yuhexiong/kafka-data-pipeline-structured-flink-java

Data pipelines from Kafka to Kafka, Kafka to Doris, and Doris to Kafka using Flink Java.

apache-doris apache-flink apache-kafka doris flink flink-stream-processing kafka

Last synced: 09 Feb 2026

https://github.com/lmdcma27/dataexpert.io-boot-camp

Here I share my practice exercises and solutions for the DataExpert boot camp. You can follow it in the repo: https://github.com/DataExpert-io/data-engineer-handbook/tree/main/bootcamp/materials

analytical-patterns apache-flink apache-spark data-modeling data-quality-patterns unit-testing-pipelines

Last synced: 25 Feb 2026

https://github.com/abdelhakim-gh/bigdata_project

This project aims to establish a data streaming pipeline with storage, processing, and visualization

apache-flink apache-hadoop apache-kafka elasticsearch github-api kibana python

Last synced: 08 Feb 2026

https://github.com/kczulko/apache-flink-standalone-docker-cluster

Apache Flink standalone cluster based on docker containers

apache-flink docker

Last synced: 10 May 2026

https://github.com/gamussa/one-does-not-simply-query-a-stream

WIP demos and code examples for the talk on how to query a stream

apache-flink apache-pinot kafka-streams risingwave

Last synced: 27 Mar 2025

https://github.com/factorhouse/factor-telemetry

Ready-to-use observability dashboards and telemetry integrations for Apache Kafka and Apache Flink, powered by high-fidelity metrics from Factor House.

apache-flink apache-kafka factorhouse flex flink grafana grafana-dashboards kafka kpow monitoring observability openmetrics prometheus telemetry

Last synced: 12 Jun 2026

https://github.com/raffy23/tdrive-stream-processing

Stream processing of the T-Drive trajectory data sample

apache-flink apache-kafka lwjgl3 scala scalajs

Last synced: 30 Jul 2025

https://github.com/jinsyin/flink-connector-mongo

Flink connector for MongoDB

apache-flink connector flink mongo mongodb

Last synced: 15 Apr 2026

https://github.com/j3-signalroom/ccaf-housekeeping-python_lib

The CCAF Housekeeping Python Library is a CI/CD support tool designed to automate the teardown of a Flink table and its associated Kafka resources—such as topics and schemas—along with any long-running statements linked to it.

apache-flink confluent confluent-flink confluent-kafka confluent-schema-registry kafka

Last synced: 19 Aug 2025

https://github.com/danielavdar/apacheflinksimpleexample

Apache Flink simple example

apache-flink scala

Last synced: 28 Mar 2025

https://github.com/j3-signalroom/apache_flink-kickstarter

Examples of Apache Flink® applications showcasing the DataStream API and Table API in Java and Python, featuring AWS, GitHub, Terraform, and Apache Iceberg.

apache-flink apache-iceberg aws-glue aws-parameter-store aws-s3 aws-secrets-manager flink flink-examples flink-kafka flink-stream-processing github-actions iceberg snowflake streamlit-dashboard terraform-cloud

Last synced: 16 Mar 2025

https://github.com/gordonmurray/apache_flink_and_hudi

Using Apache Flink to store data in S3 using Apache Hudi

apache-flink apache-hudi parquet s3

Last synced: 12 Feb 2026

https://github.com/lapetitesouris/sensormetrics

Apache Flink Aggregation job

apache-flink flink stream-processing

Last synced: 03 Apr 2025

https://github.com/j3-signalroom/supercharge_streamlit-apache_flink

Engaging, interactive visualizations crafted with Streamlit, seamlessly powered by Apache Flink in batch mode to reveal deep insights from data.

apache-flink apache-iceberg aws-glue-data-catalog flink flink-sql iceberg kafka pyflink streamlit streamlit-dashboard

Last synced: 22 May 2026

https://github.com/j3-signalroom/apache_flink-kickstarter-ii

Apache Flink Kickstarter II (2026) showcases Flink 2.1.x through hands-on, production-focused examples on Confluent Platform + Minikube, with comparisons to Confluent Cloud. End-to-end pipelines in Java & Python bridge real-world streaming architecture.

apache-flink confluent-cloud confluent-platform data-engineering data-science flink minikube platform-engineering

Last synced: 12 Apr 2026

https://github.com/jaehyeon-kim/oml-digital-twin-hotrolling

A streaming Digital Twin of a steel hot rolling mill demonstrating Online Machine Learning (OML) with Apache Kafka, Apache Flink and MOA to handle real-time concept drift.

apache-flink apache-kafka concept-drift digital-twin discrete-event-simulation dynamic-des industry-4-0 kotlin massive-online-analysis moa online-machine-learning python stream-processing

Last synced: 05 Jun 2026

https://github.com/firoz-ahmad-likhon/kafka-flink-clickstream

Production-grade real-time streaming pipeline using Apache Kafka and Apache Flink to simulate real-world streaming workflows.

apache-flink apache-kafka clickstream flink kafka stream-processing streaming

Last synced: 01 May 2026

https://github.com/zablon-oigo/flink-streaming-elt-fluss

This project demonstrates a streaming ELT job from PostgreSQL to Fluss using Flink CDC, including full-database synchronization and schema change evolution

apache-flink apache-fluss docker-compose postgresql

Last synced: 23 Jun 2026

https://github.com/gordonmurray/tofu_aws_apache_flink

Using OpenTofu to create an Apache Flink cluster, with Task Managers in an auto scaling group of Spot instances to help reduce costs

ansible apache-flink opentofu

Last synced: 08 Jul 2025

https://github.com/elkoyote07/kafkaflinkplayground

A friendly playground for experimenting with Kafka and Flink integration and development.

apache-flink apache-kafka big-data docker docker-compose flink java kafka opensource python streams web-ui

Last synced: 05 May 2026

https://github.com/briandenicola/apache-flink-learnings

A repo to learn Apache Flink

apache-flink learn-by-doing

Last synced: 10 Mar 2026

https://github.com/tashi-2004/apache-flink-spark-data-streaming

This project showcases a real-time data streaming pipeline using Apache Flink, Apache Spark, and Grafana. It streams data, stores it in Parquet format, and performs aggregations for insights, with seamless visualization via Grafana dashboards.

apache-flink apache-spark data-aggregation data-analysis data-science data-streaming data-visualization flink flink-stream-processing flink-streaming grafana-dashboard grafana-plugin pyflink python3

Last synced: 09 Feb 2026

https://github.com/arthurmgraf/streamflow-analytics

Real-time Streaming Data Platform for E-commerce Fraud Detection — Kafka (Strimzi), Flink (PyFlink), Airflow, PostgreSQL (CloudNativePG), K3s, Terraform/Terragrunt. Medallion Architecture, 5 fraud rules, 85 tests, full CI/CD.

airflow apache-flink data-engineering fraud-detection kafka kubernetes machine-learning python real-time-analytics terraform

Last synced: 19 Feb 2026

https://github.com/lukashass/flink-devcontainer

Apache Flink development environment using Gitpod or VSCode Remote-Containers

apache-flink devcontainer flink gitpod remote-containers vscode

Last synced: 24 Mar 2025

https://github.com/angeligareta/flink-overview

This project aims to predict the delays on the Yellow taxi dataset, by implementing an application based on Apache Flink.

apache apache-flink java-8 upm yellow-taxi

Last synced: 16 Mar 2025

https://github.com/j3-signalroom/linux_flink_with_iceberg

Apache Flink Docker image with Apache Iceberg support for Linux (i.e., non-Mac M chip).

apache-flink apache-iceberg flink iceberg

Last synced: 18 Mar 2026

https://github.com/shixi99/data-streaming-kafk-flink

This is a simple showcase of data streaming using different technologies such as Kafka, Flink, PostgreSQL

apache-flink apache-kafka docker postgressql

Last synced: 17 May 2026

https://github.com/kzmlabs/flink-statefun

Actively maintained continuation of Apache Flink Stateful Functions — updated for Flink 2.2.0 and Java 21, published to Maven Central as io.github.kzmlabs.flinkstatefun. Stateful serverless stream processing on Kubernetes.

actor-model apache-flink distributed-systems event-driven event-driven-architecture flink flink-kubernetes-operator java java21 kafka kubernetes maven-central rocksdb serverless stateful-functions stateful-stream-processing statefun stream-processing

Last synced: 29 Apr 2026

https://github.com/mrsimpson/slides-data-in-motion

Presentation about Event Stream Processing

apache-flink slidev streaming

Last synced: 29 Jan 2026

https://github.com/j3-signalroom/mac_flink_with_iceberg

Apache Flink Docker image with Apache Iceberg support for Mac M2, M3, or M4 chips.

apache-flink apache-iceberg flink iceberg

Last synced: 18 Mar 2026

https://github.com/hieuung/streaming-kafka

Using various data processing tools for a real-time data pipeline with Kafka

apache-beam apache-flink apache-spark kafka kafka-consumer kafka-producer spark-streaming spark-streaming-kafka

Last synced: 27 Feb 2026

https://github.com/gordonmurray/flink-connector-iggy

A Flink 1.18 Source Connector for Apache Iggy (v0.7.0) with Flink SQL support, metrics, and TLS.

apache-flink apache-iggy

Last synced: 04 Apr 2026