Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-data-activation
A curated list of awesome Data Activation resources, libraries, tools and applications.
https://github.com/nagstler/awesome-data-activation
-
ETL (Extract, Transform, Load)
-
Open Source ETL Tools
- Apache NiFi - A powerful and scalable system to process and distribute data between disparate systems.
- Singer - An open-source standard for writing scripts that move data between databases, web APIs, files, and more.
- Apache Kafka - A distributed streaming platform that can be used for building real-time data pipelines and streaming apps.
- Meltano - An open-source ELT platform, originally created at GitLab, that enables you to integrate various data sources and destinations.
-
Commercial ETL Platforms
- Fivetran - A cloud-based data integration platform that enables data engineers to build data pipelines to sync data from various sources to data warehouses.
- Stitch - A cloud-first, developer-focused platform for rapidly moving data from source to destination.
- Talend - A unified platform for data integration and data integrity to solve complex data challenges at scale.
- AWS Glue - A fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analytics.
- Informatica PowerCenter - An enterprise-grade data integration platform for complex, high-performance data management.
-
-
Reverse ETL (rETL)
-
Open Source rETL Tools
- Meltano - An open-source ELT platform that supports reverse ETL through its extensive plugin ecosystem.
- Grouparoo - An open-source framework for syncing customer data from your data warehouse to cloud-based tools.
- Jitsu - An open-source data integration platform that can handle both ETL and reverse ETL processes.
- Multiwoven - An open-source Reverse ETL platform that enables real-time data synchronization between data warehouses and business tools.
-
Commercial rETL Platforms
- Census - A reverse ETL platform that syncs data from your warehouse to your business tools, enabling operational analytics.
- Polytomic - A data activation platform that moves data from your data warehouse into your business tools.
- Omnata - A reverse ETL platform that specializes in syncing data from cloud data warehouses to Salesforce.
- Seekwell - A reverse ETL tool that allows you to build data products and sync warehouse data to your tools.
- Segment Reverse ETL - Part of Segment's CDP, this feature allows you to send computed traits and audiences to downstream tools.
-
-
Data Warehouses and Lakes
-
Cloud Data Warehouses
- Databricks SQL - A cloud data warehouse built on an open lakehouse architecture, offering high performance and seamless integration with data lakes.
- Amazon Redshift - A fully managed, petabyte-scale data warehouse service in the cloud, part of the AWS ecosystem.
- Google BigQuery - A serverless, highly scalable, and cost-effective multi-cloud data warehouse designed for business agility.
- Firebolt - A cloud data warehouse architected for high performance and efficiency on large-scale data.
-
Data Lakes
- Amazon S3 - Object storage built to store and retrieve any amount of data from anywhere, commonly used as a data lake solution.
- Azure Data Lake Storage - A highly scalable data lake solution for big data analytics, built on Azure Blob Storage.
- Google Cloud Storage - A unified object storage for developers and enterprises, from live applications data to cloud archival.
- Databricks Delta Lake - An open-source storage layer, originally developed by Databricks, that brings reliability to data lakes through ACID transactions.
- Cloudera Data Platform - A hybrid data platform for data engineering, streaming analytics, and data science workloads.
-
Data Lakehouses
- AWS Lake Formation - A service that makes it easy to set up, secure, and manage your data lake.
- Dremio - A data lakehouse platform that delivers high-performance SQL querying directly on cloud data lake storage.
- Starburst - A data lakehouse platform built on open-source Trino, providing fast analytics across varied data sources.
- Oracle Autonomous Data Warehouse - A cloud-native data lakehouse solution that combines elements of both data warehouses and data lakes.