awesome-dataops
An awesome list of DataOps products, open source tools, and resources
https://github.com/datacoon/awesome-dataops
Last synced: 2 days ago
-
Open source
-
Data Pipeline Orchestration
- Apache Airflow - a platform created by the community to programmatically author, schedule, and monitor workflows.
-
ETL tools
- Apache Kafka - a distributed streaming platform.
- Apache NiFi - an easy-to-use, powerful, and reliable system to process and distribute data.
-
-
Commercial products and services
-
Platforms
- Astronomer - spin up and scale Apache Airflow clusters
- Databand - Databand tracks your pipeline execution metadata, so you can evaluate changes in runtimes, code, data, and critical business KPIs.
- Prefect - a workflow management system designed for modern infrastructure and powered by open-source software.
- Unravel - helps ops engineers, app developers, and enterprise architects reduce the complexity of delivering reliable application performance by providing unified visibility and operational intelligence to optimize your entire ecosystem.
-
-
Cloud ETL
-
Platforms
- AWS Glue - a fully managed ETL (extract, transform, and load) service that makes it simple and cost-effective to categorize your data, clean it, enrich it, and move it reliably between various data stores.
- Azure Data Factory - a hybrid data integration service that simplifies ETL operations
- ETLWorks - a cloud-first, any-to-any data integration platform
-
-
Data catalogs
-
Platforms
- Collibra Data Catalog - empowers business users to quickly discover and understand data that matters
- SQL Data Catalog - a tool to discover and classify sensitive data in MS SQL Server
-
Testing and monitoring
- RightData - a data testing, reconciliation, and validation suite that helps stakeholders identify issues related to data consistency, quality, completeness, and gaps.
-