Projects in Awesome Lists tagged with data-ops
A curated list of projects in awesome lists tagged with data-ops.
https://github.com/prefecthq/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
automation data data-engineering data-ops data-science infrastructure ml-ops observability orchestration pipeline prefect python workflow workflow-engine
Last synced: 10 Apr 2026
https://github.com/avaiga/taipy
Turns Data and AI algorithms into production-ready web applications in no time.
automation data-engineering data-integration data-ops data-visualization datascience developer-tools hacktoberfest hacktoberfest2023 job-scheduler mlops orchestration pipeline pipelines python scenario scenario-analysis taipy-core taipy-gui workflow
Last synced: 05 Feb 2026
https://github.com/PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
automation data data-engineering data-ops data-science infrastructure ml-ops observability orchestration pipeline prefect python workflow workflow-engine
Last synced: 24 Mar 2025
https://github.com/Avaiga/taipy
Turns Data and AI algorithms into production-ready web applications in no time.
automation data-engineering data-integration data-ops data-visualization datascience developer-tools hacktoberfest hacktoberfest2023 job-scheduler mlops orchestration pipeline pipelines python scenario scenario-analysis taipy-core taipy-gui workflow
Last synced: 05 Apr 2025
https://github.com/redpanda-data/connect
Fancy stream processing made operationally mundane
amqp cqrs data-engineering data-ops etl event-sourcing go golang kafka logs message-bus message-queue nats rabbitmq stream-processing stream-processor streaming-data
Last synced: 24 Apr 2026
https://github.com/Jeffail/benthos
Fancy stream processing made operationally mundane
amqp cqrs data-engineering data-ops etl event-sourcing go golang kafka logs message-bus message-queue nats rabbitmq stream-processing stream-processor streaming-data
Last synced: 25 Mar 2025
https://github.com/marquezproject/marquez
Collect, aggregate, and visualize a data ecosystem's metadata
data-dictionary data-discovery data-ecosystem-metadata data-governance data-lineage data-ops data-provenance marquez metadata metadata-service
Last synced: 13 May 2025
https://github.com/MarquezProject/marquez
Collect, aggregate, and visualize a data ecosystem's metadata
data-dictionary data-discovery data-ecosystem-metadata data-governance data-lineage data-ops data-provenance marquez metadata metadata-service
Last synced: 27 Mar 2025
https://marquezproject.github.io/marquez/
Collect, aggregate, and visualize a data ecosystem's metadata
data-dictionary data-discovery data-ecosystem-metadata data-governance data-lineage data-ops data-provenance marquez metadata metadata-service
Last synced: 05 May 2025
https://github.com/automaticmode/active_workflow
Polyglot workflows without leaving the comfort of your technology stack.
activeworkflow agents data-engineering data-ops event-driven ifttt orchestration-framework scheduler scheduling self-hosted services-platform workflow
Last synced: 14 Mar 2025
https://github.com/snowflakedb/snowflake-cli
Snowflake CLI is an open-source command-line tool explicitly designed for developer-centric workloads in addition to SQL operations.
cli data-ops devops-tools snowflake sql
Last synced: 01 Apr 2026
https://github.com/alexjc/weboptout
Opt-Out tool to check Copyright reservations in a way that even machines can understand.
command-line-tool copyright data-ops ml-pipeline opt-out robots-txt terms-of-service webscraping
Last synced: 13 Sep 2025
https://github.com/datachecks/dcs-core
Open Source Data Quality Monitoring.
data-engineering data-governance data-observability data-ops data-quality-monitor data-quality-monitoring data-validation database dataops dataquality elasticsearch etl metrics mlops monitoring mysql postgres postgresql python sql
Last synced: 03 Mar 2026
https://github.com/dqops/dqo
Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, then let DQOps run them daily to detect data quality issues.
data-observability data-ops data-profiling data-quality data-quality-checks data-quality-measurement data-quality-monitoring data-quality-report monitoring
Last synced: 13 Dec 2025
https://github.com/aspuru-guzik-group/funsies
funsies is a lightweight workflow engine 🔧
automation data-engineering data-ops hashtree infrastructure python redis workflow-engine
Last synced: 05 Mar 2026
https://github.com/snowflakedb/snowflake-cli-action
GitHub Action enabling easy use of Snowflake CLI in your CI/CD workflows
actions data-ops devops-tools snowflake sql
Last synced: 23 Feb 2026
https://github.com/duyet/grant-rs
Manage Redshift/Postgres privileges in GitOps style written in Rust
data-engineering data-ops gitops hacktoberfest postgres redshift rust
Last synced: 14 Apr 2025
https://github.com/tosh2230/stairlight
A data lineage tool that detects table dependencies from rendered SQL statements.
bigquery data-catalog data-discovery data-engineering data-governance data-lineage data-management data-ops dbt gcs lineage redash s3 sql
Last synced: 16 May 2025
https://github.com/glentner/dataphile
Data analytics library for Python and suite of open source, command line based data ops tools.
data-analysis data-ops data-science python scientific-computing
Last synced: 07 May 2025
https://github.com/marwan116/supreme-task
A Prefect extension that builds on top of the task decorator to reduce negative engineering!
data-ops data-science infrastructure ml-ops orchestration prefect python workflow
Last synced: 23 Jun 2025
https://github.com/usedatabrew/open_ai_benthos_processor
OpenAI processor for Benthos
benthos benthos-plugin data-engineering data-ops etl etl-pipeline event-sourcing golang-library openai stream-processor streaming-data
Last synced: 11 Apr 2026
https://github.com/itrauco/data-dirtying-tool
A simple command-line tool to generate dirty data and do common data things in Google Cloud
data data-analysis data-engineering data-ops data-pipeline data-science data-visualization data-wrangling dirty-data google-cloud machine-learning
Last synced: 24 Feb 2025