Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

awesome-data-activation

A curated list of awesome Data Activation resources, libraries, tools and applications.
https://github.com/nagstler/awesome-data-activation

  • ETL (Extract, Transform, Load)

    • Open Source ETL Tools

      • Apache NiFi - A powerful and scalable system to process and distribute data between disparate systems.
      • Singer - An open-source standard for writing scripts that move data between databases, web APIs, files, and more.
      • Apache Kafka - A distributed streaming platform that can be used for building real-time data pipelines and streaming apps.
      • Meltano - An open-source ELT platform, originally built at GitLab, that lets you integrate various data sources and destinations.
    • Commercial ETL Platforms

      • Fivetran - A cloud-based data integration platform that enables data engineers to build data pipelines to sync data from various sources to data warehouses.
      • Stitch - A cloud-first, developer-focused platform for rapidly moving data from source to destination.
      • Talend - A unified platform for data integration and data integrity to solve complex data challenges at scale.
      • AWS Glue - A fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analytics.
      • Informatica PowerCenter - An enterprise-grade data integration platform for complex, high-performance data management.
  • Reverse ETL (rETL)

    • Open Source rETL Tools

      • Meltano - An open-source ELT platform that supports reverse ETL through its extensive plugin ecosystem.
      • Grouparoo - An open-source framework for syncing customer data from your data warehouse to cloud-based tools.
      • Jitsu - An open-source data integration platform that can handle both ETL and reverse ETL processes.
      • Multiwoven - An open-source Reverse ETL platform that enables real-time data synchronization between data warehouses and business tools.
    • Commercial rETL Platforms

      • Census - A reverse ETL platform that syncs data from your warehouse to your business tools, enabling operational analytics.
      • Polytomic - A data activation platform that moves data from your data warehouse into your business tools.
      • Omnata - A reverse ETL platform that specializes in syncing data from cloud data warehouses to Salesforce.
      • Seekwell - A reverse ETL tool that allows you to build data products and sync warehouse data to your tools.
      • Segment Reverse ETL - Part of Segment's CDP, this feature allows you to send computed traits and audiences to downstream tools.
  • Data Warehouses and Lakes

    • Cloud Data Warehouses

      • Databricks SQL - A cloud data warehouse built on an open lakehouse architecture, offering high performance and seamless integration with data lakes.
      • Amazon Redshift - A fully managed, petabyte-scale data warehouse service in the cloud, part of the AWS ecosystem.
      • Google BigQuery - A serverless, highly scalable, and cost-effective multi-cloud data warehouse designed for business agility.
      • Firebolt - A cloud data warehouse architected for high performance and efficiency on large-scale data.
    • Data Lakes

      • Amazon S3 - Object storage built to store and retrieve any amount of data from anywhere, commonly used as a data lake solution.
      • Azure Data Lake Storage - A highly scalable data lake solution for big data analytics, built on Azure Blob Storage.
      • Google Cloud Storage - Unified object storage for developers and enterprises, covering everything from live application data to cloud archival.
      • Databricks Delta Lake - An open-source storage layer, originally developed by Databricks, that brings reliability to data lakes.
      • Cloudera Data Platform - A hybrid data platform for data engineering, streaming analytics, and data science workloads.
    • Data Lakehouses

      • AWS Lake Formation - A service that makes it easy to set up, secure, and manage your data lake.
      • Dremio - A data lakehouse platform that delivers high-performance SQL querying directly on cloud data lake storage.
      • Starburst - A data lakehouse platform built on open-source Trino, providing fast analytics across varied data sources.
      • Oracle Autonomous Data Warehouse - A cloud-native data lakehouse solution that combines elements of both data warehouses and data lakes.