Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with flink
A curated list of projects in awesome lists tagged with flink.
https://github.com/zhisheng17/flink-learning
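Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" (《大数据实时计算引擎 Flink 实战与性能优化》).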
clickhouse elasticsearch flink hbase influxdb kafka loki mysql opentsdb rabbitmq redis rocketmq spark stream-processing streaming
Last synced: 30 Dec 2024
https://github.com/risingwavelabs/risingwave
Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
analytics big-data cloud-native data-engineering database distributed-database etl flink kafka ksqldb materialized-view postgres postgresql real-time real-time-analytics rust serverless spark-streaming sql stream-processing
Last synced: 30 Dec 2024
https://github.com/RisingWaveLabs/risingwave
SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
analytics big-data cloud-native data-engineering database distributed-database etl flink kafka ksqldb materialized-view postgres postgresql real-time real-time-analytics rust serverless spark-streaming sql stream-processing
Last synced: 01 Nov 2024
https://github.com/apache/zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
big-data database flink java javascript nosql scala spark zeppelin
Last synced: 30 Dec 2024
https://github.com/apache/flink-cdc
Flink CDC is a streaming data integration tool
batch cdc change-data-capture data-integration data-pipeline distributed elt etl flink kafka mysql paimon postgresql real-time schema-evolution
Last synced: 31 Dec 2024
https://github.com/zq2599/blog_demos
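GitHub repository of programmer Xinchen (欣宸), a CSDN blog expert. It contains a detailed categorization and summary of more than six hundred original articles, along with the corresponding source code, covering Java, Docker, Kubernetes, DevOps, and more.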
docker docker-java docker-jib flink java jenkins kubernetes kubernetes-java kubernetes-jenkins-maven spring spring-cloud spring-cloud-kubernetes springboot
Last synced: 31 Dec 2024
https://github.com/flink-china/flink-training-course
Flink video courses in Chinese (continuously updated...)
course flink streaming training
Last synced: 08 Nov 2024
https://github.com/water8394/flink-recommandsystem-demo
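:helicopter::rocket: A real-time product recommendation system built on Flink. Flink computes product popularity and caches it in Redis, analyzes log data, and stores profile tags and real-time records in HBase. When a user issues a recommendation request, the popularity ranking is re-sorted according to the user profile, the collaborative filtering and tag-based recommendation modules add related products to each product in the newly generated ranking, and finally the new list for the user is returned.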
flink flink-examples flink-hbase flink-kafka flink-redis recommand recommander-system
Last synced: 02 Jan 2025
https://github.com/water8394/flink-recommandSystem-demo
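:helicopter::rocket: A real-time product recommendation system built on Flink. Flink computes product popularity and caches it in Redis, analyzes log data, and stores profile tags and real-time records in HBase. When a user issues a recommendation request, the popularity ranking is re-sorted according to the user profile, the collaborative filtering and tag-based recommendation modules add related products to each product in the newly generated ranking, and finally the new list for the user is returned.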
flink flink-examples flink-hbase flink-kafka flink-redis recommand recommander-system
Last synced: 13 Sep 2024
https://github.com/dtstack/chunjun
A data integration framework
bigdata data-integration flink framework java
Last synced: 31 Dec 2024
https://github.com/DTStack/chunjun
A data integration framework
bigdata data-integration flink framework java
Last synced: 26 Oct 2024
https://github.com/alibaba/alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
apriori classification clustering data-mining feature-engineering flink flink-machine-learning flink-ml fm graph-algorithms graph-embedding kafka machine-learning recommender recommender-system regression statistics word2vec xgboost
Last synced: 31 Dec 2024
https://github.com/alibaba/Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
apriori classification clustering data-mining feature-engineering flink flink-machine-learning flink-ml fm graph-algorithms graph-embedding kafka machine-learning recommender recommender-system regression statistics word2vec xgboost
Last synced: 26 Oct 2024
https://github.com/webankfintech/dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
airflow atlas azkaban dataworks davinci dolphinscheduler flink governance griffin hadoop hive hue kettle linkis spark supperset tableau visualis workflow zeppelin
Last synced: 31 Dec 2024
https://github.com/WeBankFinTech/DataSphereStudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
airflow atlas azkaban dataworks davinci dolphinscheduler flink governance griffin hadoop hive hue kettle linkis spark supperset tableau visualis workflow zeppelin
Last synced: 26 Oct 2024
https://github.com/DataLinkDC/dinky
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
datalake datawarehouse flink flinkcdc flinksql olap real-time-computing-platform sql
Last synced: 30 Oct 2024
https://github.com/apache/incubator-paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
big-data data-ingestion flink paimon real-time-analytics spark streaming-datalake table-store
Last synced: 18 Dec 2024
https://github.com/apache/paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
big-data data-ingestion flink paimon real-time-analytics spark streaming-datalake table-store
Last synced: 31 Dec 2024
https://github.com/lakesoul-io/LakeSoul
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
arrow big-data datafusion datalake flink huggingface lakehouse lakesoul postgresql python pytorch rust spark sql streaming vectorized velox
Last synced: 30 Oct 2024
https://github.com/lakesoul-io/lakesoul
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
arrow big-data datafusion datalake flink huggingface lakehouse lakesoul postgresql python pytorch rust spark sql streaming vectorized velox
Last synced: 01 Jan 2025
https://github.com/geekyouth/szt-bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
cdh6 clickhouse docker elasticsearch flink hadoop hbase hive kafka kibana kylin mongodb mysql phoenix redis scala spark springboot szt-bigdata zookeeper
Last synced: 03 Jan 2025
https://github.com/geekyouth/SZT-bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
cdh6 clickhouse docker elasticsearch flink hadoop hbase hive kafka kibana kylin mongodb mysql phoenix redis scala spark springboot szt-bigdata zookeeper
Last synced: 31 Oct 2024
https://github.com/Qihoo360/Quicksql
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Last synced: 30 Oct 2024
https://github.com/qihoo360/quicksql
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Last synced: 03 Jan 2025
https://github.com/dtstack/flinkstreamsql
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax.
Last synced: 03 Jan 2025
https://github.com/DTStack/flinkStreamSQL
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax.
Last synced: 30 Oct 2024
https://github.com/alibaba/sreworks
Cloud Native DataOps & AIOps Platform | Cloud-native data-intelligence operations platform
aiops application cloudnative dataops devops engineering flink k8s kubernetes maintenance oam operation ops saas sre
Last synced: 03 Jan 2025
https://github.com/alibaba/SREWorks
Cloud Native DataOps & AIOps Platform | Cloud-native data-intelligence operations platform
aiops application cloudnative dataops devops engineering flink k8s kubernetes maintenance oam operation ops saas sre
Last synced: 30 Oct 2024
https://github.com/bytedance/bitsail
BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.
big-data data-integration data-lake data-pipeline data-synchronization flink high-performance real-time
Last synced: 04 Jan 2025
https://github.com/dtstack/taier
Taier is a big data development platform for submission, scheduling, operation and maintenance, and indicator information display
azkaban chunjun cronjob-scheduler dag data-schedule distributed-schedule-system flink hadoop hive job-scheduler scheduler spark task-schedule workflow-scheduling-system
Last synced: 02 Jan 2025
https://github.com/DTStack/Taier
Taier is a big data development platform for submission, scheduling, operation and maintenance, and indicator information display
azkaban chunjun cronjob-scheduler dag data-schedule distributed-schedule-system flink hadoop hive job-scheduler scheduler spark task-schedule workflow-scheduling-system
Last synced: 30 Oct 2024
https://github.com/obenner/data-engineering-interview-questions
More than 2000 data engineer interview questions.
airflow avro aws azure cassandra data-engineering data-structures flink flume hadoop hadoop-hdfs hbase hive impala interview interview-questions kafka nifi spark sql
Last synced: 02 Jan 2025
https://github.com/OBenner/data-engineering-interview-questions
More than 2000 data engineer interview questions.
airflow avro aws azure cassandra data-engineering data-structures flink flume hadoop hadoop-hdfs hbase hive impala interview interview-questions kafka nifi spark sql
Last synced: 07 Nov 2024
https://github.com/datavane/tis
Supports agile DataOps based on Flink, DataX, Flink-CDC, and Chunjun, with a web UI
cdc chunjun dataops datax etl flink flink-streaming java
Last synced: 03 Jan 2025
https://github.com/ververica/flink-sql-cookbook
The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.
apache-flink flink flink-sql sql stream-processing
Last synced: 12 Nov 2024
https://github.com/intsmaze/flink-boot
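The Lazy Squirrel (懒松鼠) Flink-Boot scaffold lets Flink fully embrace the Spring ecosystem, so developers can build distributed stream processing programs using the Java web development model; Lazy Squirrel makes crossing domains easier. It aims to let developers quickly write business code at a lower onboarding cost (without needing to understand distributed computing theory or the details of the Flink framework). To further improve agility when building large projects with the scaffold, it integrates the Spring framework for bean management by default and also integrates frameworks commonly used in microservices and web development to speed up development further, such as the MyBatis ORM framework, the Hibernate Validator validation framework, and the Spring Retry framework. See the scaffold features below for details.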
bigdata flink flink-boot java java-flink mcv mybatis sping spring-boot spring-retry
Last synced: 30 Oct 2024
https://github.com/nielsbasjes/yauaa
Yet Another UserAgent Analyzer
analyzer apache-beam apache-flink apache-hive client-hints flink hive java nifi-processor nifi-processors parse snowflake snowplow snowplowanalytics trino-plugin user-agent user-agent-analysis user-agent-parser useragent-parser useragentparser
Last synced: 03 Jan 2025
https://github.com/apache/flink-kubernetes-operator
Apache Flink Kubernetes Operator
Last synced: 05 Nov 2024
https://github.com/lw-lin/streaming-readings
Papers and readings related to streaming systems
apache-spark dataflow drizzle flink heron millwheel s4 spark-streaming spe storm stream-processing stream-processing-engine streaming streaming-engine
Last synced: 03 Jan 2025
https://github.com/touk/nussknacker
Low-code tool for automating actions on real time data | Stream processing for the users.
apache-flink automation big-data data-streaming decision-engine decision-making decisioning flink flink-kafka gui kafka low-code lowcode real-time rules-engine scala stream-processing streaming touk
Last synced: 02 Jan 2025
https://github.com/TouK/nussknacker
Low-code tool for automating actions on real time data | Stream processing for the users.
apache-flink automation big-data data-streaming decision-engine decision-making decisioning flink flink-kafka gui kafka low-code lowcode real-time rules-engine scala stream-processing streaming touk
Last synced: 31 Oct 2024
https://github.com/WeBankFinTech/WeDataSphere
WeDataSphere is a financial grade, one-stop big data platform suite.
analytics bigdata data-analysis datafabric datagovernance dataspherestudio exchangis flink hadoop hive ide linkis prophecis qualitis schedulis scriptis spark streamis visualis
Last synced: 30 Oct 2024
https://github.com/WeBankFinTech/Exchangis
Exchangis is a lightweight, highly extensible data exchange platform that supports data transmission between structured and unstructured heterogeneous data sources
dataspherestudio datax etl exchangis flink linkis sqoop transmission-engine wedatasphere
Last synced: 30 Oct 2024
https://github.com/harbby/sylph
Stream computing platform for bigdata
big-data flink java spark-streaming sql streamsql sylph
Last synced: 30 Dec 2024
https://github.com/pierre94/flink-notes
Flink study notes
bigdata flink flink-notes flinkx
Last synced: 05 Nov 2024
https://github.com/ivi-ru/flink-clickhouse-sink
Flink sink for Clickhouse
clickhouse flink flink-clickhouse-sink java
Last synced: 13 Nov 2024
https://github.com/flowerfine/scaleph
Open data platform based on Kubernetes. Scaleph supports SeaTunnel, Flink, and Doris, backed by the SeaTunnel on Flink engine, Flink Kubernetes Operator, and Doris operator.
dag data-platform dataops doris doris-manager doris-operator flink flink-kubernetes flink-kubernetes-operator flink-sql flink-sql-gateway seatunnel
Last synced: 05 Nov 2024
https://github.com/itinycheng/flink-connector-clickhouse
Flink SQL connector for ClickHouse. Supports ClickHouseCatalog and reading/writing primary data, maps, and arrays to ClickHouse.
clickhouse connector flink flink-connector
Last synced: 05 Nov 2024
https://github.com/lightbend/cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
akka cloudflow flink kubernetes microservices-architectures spark streaming-applications streaming-data streaming-runtimes
Last synced: 03 Jan 2025
https://github.com/apache/flink-ml
Machine learning library of Apache Flink
big-data flink java machine-learning ml python
Last synced: 30 Dec 2024
https://github.com/kamu-data/kamu-cli
Next-generation decentralized data lakehouse and a multi-party stream processing network
blockchain data-as-code data-management data-science datafusion flink jupyter kamu open-data open-data-fabric spark sql
Last synced: 03 Jan 2025
https://github.com/DTStack/dt-sql-parser
SQL Parsers for BigData, built with antlr4.
antlr4 autocompletion bigdata flink hive impala mysql parser postgresql spark sql sql-validation trino
Last synced: 02 Nov 2024
https://github.com/dtstack/dt-sql-parser
SQL Parsers for BigData, built with antlr4.
antlr4 autocompletion bigdata flink hive impala mysql parser postgresql spark sql sql-validation trino
Last synced: 03 Jan 2025
https://github.com/apache/doris-flink-connector
Flink Connector for Apache Doris
apache connector data-warehousing dbms doris flink mpp olap
Last synced: 05 Nov 2024
https://github.com/streamnative/pulsar-flink
Elastic data processing with Apache Pulsar and Apache Flink
apache-flink apache-pulsar batch-processing catalog data-processing flink flink-connector flink-stream-processing pulsar schema schema-registry sql stream-processing
Last synced: 19 Nov 2024
https://github.com/bytedance/cloudshuffleservice
Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark/Flink/MapReduce.
Last synced: 31 Dec 2024
https://github.com/luxiaoxun/eagle
Real time data processing system based on flink and CEP
cep complex-event-processing drools flink realtime-processing siddhi
Last synced: 30 Dec 2024
https://github.com/bytedance/CloudShuffleService
Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark/Flink/MapReduce.
Last synced: 05 Nov 2024
https://github.com/jeff-zou/flink-connector-redis
Asynchronous Flink connector based on Lettuce, supporting SQL join and sink, query caching, and debugging.
flink flink-connector flink-connector-redis flink-sql join lettuce redis
Last synced: 05 Nov 2024
https://github.com/spotify/flink-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
apache-beam apache-flink flink flink-operator kubernetes kubernetes-operator
Last synced: 04 Jan 2025
https://github.com/knaufk/flink-faker
A data generator source connector for Flink SQL based on data-faker.
Last synced: 05 Nov 2024
https://github.com/flink-extended/flink-remote-shuffle
Remote Shuffle Service for Flink
Last synced: 05 Nov 2024
https://github.com/godaai/flink-book-zh
Flink Tutorial Project
bigdata flink flink-examples flink-stream-processing
Last synced: 05 Nov 2024
https://github.com/getindata/flink-http-connector
HTTP connector for Apache Flink. Provides sources and sinks for the DataStream, Table, and SQL APIs.
data-streaming flink flink-sql flink-stream-processing java
Last synced: 04 Jan 2025
https://github.com/hortonworks/streamline
StreamLine - Streaming Analytics
flink kafka kafka-streams real-time spark-streaming storm streaming
Last synced: 30 Oct 2024
https://github.com/ing-bank/flink-deployer
A tool that helps automate deployment to an Apache Flink cluster
apache-flink deployment docker flink go golang
Last synced: 08 Nov 2024
https://github.com/zuinnote/hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
bigdata bitcoin blockchain cryptoledger ethereum flink hadoop hive spark
Last synced: 02 Jan 2025
https://github.com/apache/flink-connector-kafka
Apache Flink
connector datastream flink kafka sql table
Last synced: 03 Jan 2025
https://github.com/sansa-stack/sansa-stack
Big Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena http://sansa-stack.github.io/SANSA-Stack/
apache-jena apache-spark distributed-computing flink rdf semantic-web spark
Last synced: 03 Jan 2025
https://github.com/SANSA-Stack/SANSA-Stack
Big Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena http://sansa-stack.github.io/SANSA-Stack/
apache-jena apache-spark distributed-computing flink rdf semantic-web spark
Last synced: 20 Nov 2024
https://github.com/apache/flink-shaded
Apache Flink shaded artifacts repository
Last synced: 02 Jan 2025
https://github.com/apache/flink-connector-jdbc
Apache Flink
connector datastream flink jdbc sql table
Last synced: 03 Jan 2025
https://github.com/streamnative/pulsar-spark
Spark Connector to read and write with Pulsar
apache-pulsar apache-spark batch-processing data-processing data-science flink spark spark-sql stream-processing structured-streaming
Last synced: 31 Dec 2024
https://github.com/HamaWhiteGG/flink-sql-security
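FlinkSQL data masking and row-level permission solution with source code. It supports user-level data masking and row-level data access control, meaning specific users can access only masked data or authorized rows. This is a solution for Flink in the real-time domain, similar to the Row-level Filter and Column Masking features of Ranger for Hive in offline data warehouses.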
Last synced: 05 Nov 2024
https://github.com/water8394/flink-simple-tutorial
:bell::pill: A simple Flink tutorial that combines examples from the official repository with common scenarios to demonstrate Flink's basic features.
Last synced: 10 Nov 2024