An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with google-cloud-dataflow

A curated list of projects in awesome lists tagged with google-cloud-dataflow.

https://github.com/googlecloudplatform/professional-services

Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.

bigquery examples gke google-cloud-compute google-cloud-dataflow google-cloud-ml google-cloud-platform solutions tools

Last synced: 13 May 2025

https://github.com/GoogleCloudPlatform/professional-services

Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.

bigquery examples gke google-cloud-compute google-cloud-dataflow google-cloud-ml google-cloud-platform solutions tools

Last synced: 14 Mar 2025

https://github.com/GoogleCloudPlatform/DataflowJavaSDK

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

big-data data-analysis data-mining data-processing data-science google-cloud-dataflow

Last synced: 01 May 2025

https://github.com/googlecloudplatform/dataflowjavasdk

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

big-data data-analysis data-mining data-processing data-science google-cloud-dataflow

Last synced: 03 Oct 2025

https://github.com/snowplow-archive/google-cloud-dataflow-example-project

Example stream processing job, written in Scala with Apache Beam, for Google Cloud Dataflow

gcp google-bigtable google-cloud-dataflow google-cloud-platform google-cloud-pubsub scala

Last synced: 21 Apr 2025

https://github.com/ryanmcdowell/dataflow-pubsub-event-router

An example pipeline that re-publishes events to different topics based on a message attribute.

apache-beam google-cloud-dataflow google-cloud-platform google-cloud-pubsub

Last synced: 18 Jul 2025

https://github.com/pompierninja/beam-amazon-batch-example

A practical example of batch processing on Google Cloud Dataflow using the Go SDK for Apache Beam.

amazon apache-beam batch-processing big-data golang google-cloud-dataflow

Last synced: 14 Dec 2025

https://github.com/goatcheesesaladwithpeanutoildressing/beam-amazon-batch-example

A practical example of batch processing on Google Cloud Dataflow using the Go SDK for Apache Beam.

amazon apache-beam batch-processing big-data golang google-cloud-dataflow

Last synced: 25 Feb 2025

https://github.com/mponce/google-cloud-dataflow-pipeline

Google Cloud Dataflow: load CSV files into BigQuery tables.

csv-import google-bigquery google-cloud-dataflow google-cloud-storage

Last synced: 23 Jan 2026

https://github.com/ryanmcdowell/dataflow-bigquery-dynamic-destinations

An example pipeline for dynamically routing events from Pub/Sub to different BigQuery tables based on a message attribute.

apache-beam bigquery google-cloud-dataflow google-cloud-platform

Last synced: 09 Sep 2025

https://github.com/viveknaskar/triggering-dataflow-pipeline-function

A Google Cloud Function that triggers a Cloud Dataflow pipeline when a file is uploaded to a Cloud Storage bucket.

google-cloud-dataflow google-cloud-function google-cloud-platform javascript nodejs

Last synced: 20 Jun 2025

https://github.com/googlecloudplatform/dataflow-metrics-exporter

CLI tool that collects Dataflow resource and execution metrics and exports them to either BigQuery or Google Cloud Storage. It is useful for comparing and visualizing metrics when benchmarking Dataflow pipelines across data formats, resource configurations, and so on.

apache-beam google-cloud-dataflow

Last synced: 08 Oct 2025

https://github.com/sinmetal/pug2pug

Migrates Cloud Datastore data using Cloud Dataflow.

google-cloud-dataflow java

Last synced: 18 Jun 2025

https://github.com/rm3l/apache-beam-java-firestore-batch-dataflow

Companion repo for the blog post: https://rm3l.org/batch-writes-to-google-cloud-firestore-using-the-apache-beam-java-sdk-on-google-cloud-dataflow/

apache-beam beam dataflow firestore google-cloud-dataflow google-cloud-firestore

Last synced: 26 Mar 2025

https://github.com/pompierninja/hands-on-apache-beam

Work in progress: a simple explanation of what batch processing and stream processing are, with Apache Beam and Cloud Dataflow.

apache-beam google-cloud-dataflow

Last synced: 03 Mar 2026

https://github.com/goatcheesesaladwithpeanutoildressing/hands-on-apache-beam

Work in progress: a simple explanation of what batch processing and stream processing are, with Apache Beam and Cloud Dataflow.

apache-beam google-cloud-dataflow

Last synced: 25 Feb 2025

https://github.com/kushal-bage/streaming-data-pipeline

📡 Build a robust streaming data pipeline using Docker, Kafka, Spark, and Cassandra for real-time ingestion, processing, and analytics.

dash data-mining data-science google-cloud-dataflow kafka lambda mongodb olap python spark sparta stream-processing streaming streaming-data topic-tracking triggers tweets twitter

Last synced: 07 Oct 2025

https://github.com/emediongfrancis/enhancing-data-quality-and-consistency-gcp-kafka-airflow-snowflake

This project focuses on maintaining data quality and consistency across different data sources. It features Google Cloud Dataflow for data cataloging, Apache Airflow for ETL, Google Cloud Data Catalog for visual data preparation, and Snowflake for high-quality data storage and analysis.

apache-airflow apache-kafka data-quality google-cloud-data-catalog google-cloud-dataflow snowflake terraform veracity

Last synced: 20 Aug 2025