An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-orchestration

A curated list of projects in awesome lists tagged with data-orchestration.

https://github.com/kestra-io/kestra

:zap: Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 600+ plugins. Alternative to Airflow, n8n, Rundeck, VMware vRA, Zapier ...

automation data-orchestration devops high-availability infrastructure-as-code java low-code lowcode orchestration pipeline pipeline-as-code workflow

Last synced: 20 Feb 2026

https://github.com/alluxio/alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

alluxio data-analysis data-orchestration hadoop memory-speed presto spark tensorflow virtual-distributed-filesystem

Last synced: 08 Jan 2026

https://github.com/apache/incubator-graphar

An open source, standard data file format for graph data storage and retrieval.

big-data data-orchestration etl graph graph-analysis graph-storage pyspark spark

Last synced: 24 Oct 2025

https://github.com/iam-mhaseeb/skytrax-data-warehouse

A full data warehouse infrastructure with ETL pipelines running inside Docker on Apache Airflow for data orchestration, AWS Redshift as the cloud data warehouse, and Metabase for data visualizations such as analytical dashboards.

airflow data-analysis data-analytics data-cleaning data-engineering data-orchestration data-processing data-visualization data-warehouse data-warehousing database docker metabase python python3 redshift s3 s3-bucket sql

Last synced: 12 Aug 2025

https://github.com/flyingriverhorse/skyulf

Build and ship production ML pipelines faster: a pipeline library with an optional self-hosted visual layer for modular, reproducible workflows, local testing, and experiment tracking.

celery data-orchestration deep-learning docker-compose experiment-tracking feature-engineering-python local-first low-code machine-learning ml-pipeline ml-platform-workflow mlops mlops-workflow model-deployment model-registry privacy-first react redis self-hosted visual-programming

Last synced: 25 Apr 2026

https://github.com/dagster-io/skills

A collection of Claude Code plugins for working with Dagster.

ai-tools claude-code dagster data-engineering data-orchestration marketplace

Last synced: 09 Apr 2026

https://github.com/jonathanneo/data-aware-orchestration

Data-aware orchestration with dagster, dbt, and airbyte

data-orchestration

Last synced: 05 May 2025

https://github.com/kestra-io/examples

Best practices for data workflows, integrations with the Modern Data Stack (MDS), Infrastructure as Code (IaC), Cloud Provider Services

analytics-engineering automation data-engineering data-orchestration data-pipelines data-workflows orchestration

Last synced: 09 Oct 2025

https://github.com/sap-samples/btp-data-to-value-workshop

This repo contains a dataset, exercises, and sample code for an end-to-end SAP BTP data-to-value bootcamp covering SAP HANA Cloud, SAP Data Warehouse Cloud, SAP Data Intelligence Cloud, and SAP Analytics Cloud.

advanced-analytics analytics data-management data-orchestration data-science data-to-value machine-learning predictive-planning sample sample-code sap-analytics-cloud sap-btp sap-data-intelligence-cloud sap-data-warehouse-cloud sap-hana-cloud workshop

Last synced: 13 Apr 2025

https://github.com/astronomer/airflow-provider-fivetran-async

A new Airflow Provider for Fivetran, maintained by Astronomer and Fivetran

airflow airflow-operator airflow-provider apache-airflow dag data-orchestration etl python workflow

Last synced: 06 Apr 2025

https://github.com/anna-geller/kestra-ci-cd

CI/CD repository template to automate deployments of your production flows

automation data-engineering data-orchestration data-pipelines data-workflows orchestration

Last synced: 04 Mar 2026

https://github.com/nshkrdotcom/flowstone

Asset-first data orchestration for Elixir/BEAM. Dagster-inspired with OTP fault tolerance, LiveView dashboard, lineage tracking, checkpoint gates, and distributed execution via Oban.

asset-management beam chaos-engineering dag data-lineage data-orchestration data-pipeline ecto elixir fault-tolerance human-in-the-loop multi-tenant nshkr-ai-agents oban otp phoenix-liveview scheduling tdd telemetry workflow-engine

Last synced: 13 Jan 2026

https://github.com/alluxio/k8s-operator

An operator for managing an Alluxio system on a Kubernetes cluster

alluxio data-analysis data-orchestration kubernetes kubernetes-operator machine-learning

Last synced: 15 Aug 2025

https://github.com/gitbrincie212/chronographer

ChronoGrapher is a WIP project that aims to implement a flexible multi-language scheduler, allowing multiple programming languages to interact with one another or to be used on their own

automation chronographer cron data-engineering data-orchestration data-science java javascript python python3 rust rust-crate rust-lang rust-library schedule scheduler typescript workflow-engine workflow-orchestration

Last synced: 23 Sep 2025

https://github.com/taquynhnga2001/proptech-dagster

Build an ELT pipeline with dagster and dbt to schedule loading HDB resale transactions in Singapore into a Google BigQuery data warehouse, then create a Power BI dashboard to enhance insight exploration.

bigquery dagster data-integration data-orchestration data-warehouse dbt elt etl powerbi python

Last synced: 14 Feb 2026

https://github.com/jasontanx/prefect-learning

Prefect - Data orchestration tool practice & learning

data-engineer data-orchestration prefect workflow-management

Last synced: 26 Mar 2025

https://github.com/ryandmonk/knowledge_graph_brain

MCP-native knowledge graph orchestrator that unifies data silos with GraphRAG, dynamic connectors, and local AI.

data-orchestration enterprise-ai graphrag knowledge-graph langgraph local-ai model-context-protocol neo4j ollama rag typescript vector-search

Last synced: 18 Sep 2025

https://github.com/dagster-io/dagster-claude-plugins

A collection of Claude Code plugins for working with Dagster.

ai-tools claude-code dagster data-engineering data-orchestration marketplace

Last synced: 21 Jan 2026

https://github.com/gades-dataeng/webinar

Code, scripts, and resources for the Data Engineering Fundamentals Course Webinar, covering Python, data pipelines, Apache Airflow, and more.

apache-airflow data-engineering data-orchestration data-orchestrator data-pipelines dimensional-modeling python sql

Last synced: 31 Mar 2025

https://github.com/benzsevern/goldenpipe

Golden Suite orchestrator — chains validation, transformation, and entity resolution. 4 MCP tools on Smithery.

a2a agent cli data-engineering data-orchestration data-pipeline data-quality etl fastapi golden-suite mcp mcp-server orchestration pipeline pluggable polars python remote-mcp tui yaml

Last synced: 04 Apr 2026

https://github.com/kingabzpro/5-airflow-alternatives-for-data-orchestration-tutorial

Code examples of Luigi, Prefect, Kedro, Dagster, and MageAI

dagster data data-orchestration kedro luigi mageai prefect

Last synced: 18 Apr 2026

https://github.com/ddeutils/data-orchestra

❌ Full-Stack Data Orchestration config by YAML template with Flask & HTMX

data-orchestration docker flask htmx python3

Last synced: 05 May 2026