Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

awesome-data-pipeline

Awesome list for data pipelines
https://github.com/KennethanCeyer/awesome-data-pipeline

  • Apache Airflow - (Apache foundation / Airbnb / Open Source / Free).
  • Argo - (CNCF foundation / Kubernetes-friendly / Open Source / Free).
  • Apache NiFi - (Apache foundation / Dataflow / Free).
  • Luigi - (Spotify / Open Source / Free).
  • Azure Data Factory - (Azure Cloud / Subscription fee).
  • Apache Flume - (Apache foundation / Data Ingestion / Open Source / Free).
  • Stitch - (Talend / ETL / Subscription fee).
  • Logstash - (Elastic / Data Ingestion / Cloud or On-prem / Hybrid fee).
  • Filebeat - (Elastic / Data Ingestion / Cloud or On-prem / Hybrid fee).
  • Fluentd - (CNCF foundation / Open Source / Free or License fee).
  • Datadog - (Datadog / Cloud / APM / Subscription fee).
  • New Relic - (New Relic / Cloud / APM / Subscription fee).
  • Hadoop Distributed File System (HDFS) - (Hybrid / Open Source / Hadoop-ecosystem).
  • AWS S3 - (AWS Cloud / Storage Object).
  • Azure Blob Storage - (Azure Cloud / Storage Object).
  • GCP Cloud Storage - (Google Cloud / Storage Object).
  • Databricks Delta Lake - (Hybrid / Multi-cloud / Open Source).
  • Apache Hive - (Apache foundation / Hadoop-friendly / MapReduce / Free).
  • Snowflake - (Multi-cloud / SQL-friendly / Subscription fee).
  • AWS Redshift - (AWS Cloud / SQL-friendly / Subscription fee).
  • Azure Synapse Analytics - (Azure Cloud / SQL-friendly / Subscription fee).
  • GCP BigQuery - (Google Cloud / SQL-friendly / On-demand fee).
  • IBM DB2 - (IBM / On-prem / SQL-friendly / Subscription fee).
  • Apache Druid - (Apache foundation / Real-time datastore / Free).
  • Apache Pinot - (Apache foundation / Real-time datastore / Free).
  • AWS Aurora - (AWS Cloud / Rich-cloud datastore / Subscription fee).
  • GCP Cloud Spanner - (Google Cloud / HA datastore that breaks away from CAP / Subscription fee).
  • Azure Cosmos DB - (Azure Cloud / NoSQL datastore / Subscription fee).
  • Presto - (Facebook / Open Source / SQL-friendly / Free or License fee).
  • Apache Impala - (Apache foundation / Cloudera / Open Source / SQL-friendly / Free or License fee).
  • AWS Athena - (AWS Cloud / SQL-friendly / On-demand fee).
  • AWS Redshift Spectrum - (AWS Cloud / SQL-friendly / On-demand fee).
  • Apache Kafka - (Apache foundation / Confluent / LinkedIn / Message Broker / Open Source / Free or License fee).
  • RabbitMQ - (VMware / Messaging Queue / Free or License fee).
  • AWS Kinesis - (AWS Cloud / Message Broker / Subscription fee).
  • AWS SQS - (AWS Cloud / Messaging Queue / Subscription fee).
  • GCP PubSub - (Google Cloud / Message Broker / Subscription fee).
  • Azure Event Hub - (Azure Cloud / Message Broker / Subscription fee).
  • Apache Spark - (Apache foundation / Databricks / In-memory processing / Open Source / Free or License fee).
  • Apache Beam - (Apache foundation / Google / Data processing / Open Source / Free or License fee).
  • Apache Storm - (Apache foundation / Backtype / Twitter / Stream processing / Open Source / Free).
  • Apache Flink - (Apache foundation / Stream processing / Open Source / Free).
  • AWS Glue - (AWS Cloud / Integrated Data System / ETL / On-demand fee).
  • Apache Superset - (Apache foundation / Airbnb / Business Intelligence (BI) / Open Source / Free).
  • Airpal - (Airbnb / Query Editor / Open Source / Free).
  • Hue - (Cloudera / Query Editor / Open Source / Free).
  • Kibana - (Elastic / Dashboard / Hybrid fee).
  • Databricks Notebook - (Databricks / Notebook / Hybrid fee).
  • Jupyter Notebook - (Jupyter / Notebook / Open Source / Free).
  • Pandas - (NumFOCUS / Data processing / Open Source / Free).
  • Plotly - (Plotly / Data visualization / Hybrid fee).
  • Apache Parquet - (Apache foundation / Data Format / Open Source / Free).
  • Apache ORC - (Apache foundation / Hortonworks / Facebook / Data Format / Open Source / Free).
  • Apache Avro - (Apache foundation / Data Format / Open Source / Free).
  • Apache Kudu - (Apache foundation / Cloudera / Data Format / Open Source / Free).
  • Apache Arrow - (Apache foundation / Data Format / Open Source / Free).
  • Delta - (Databricks / Data Format / Free or License fee).
  • JSON - (Data Format / Free).
  • CSV - (Data Format / Free).
  • TSV - (Data Format / Free).
  • HDF5 - (The HDF Group / Data Format / Open Source (licensed by [HDF5](https://www.hdfgroup.org/licenses)) / Free).
  • Apache Zeppelin - (Apache foundation / Business Intelligence (BI) / Open Source / Free or License fee).
  • Tableau - (Salesforce / Business Intelligence (BI) / Hybrid fee).
  • Redash - (Redash Inc / Databricks / Business Intelligence (BI) / Hybrid fee).
  • Looker - (Looker Data Sciences Inc / Business Intelligence (BI) / Subscription fee).
  • Data Studio - (Google Cloud / Business Intelligence (BI) / Free).
  • Power BI - (Microsoft / Business Intelligence (BI) / Subscription fee).
  • H2O - (H2O.ai / Model Evaluation / Subscription fee).
  • Feast - (Tecton / Gojek / Feature Store / Open Source / Free).
  • Vertex AI - (Google Cloud / Hybrid Features for AI / Subscription fee).
  • DataRobot - (DataRobot Inc / Feature Engineering / Subscription fee).
  • WandB - (Weights & Biases / Model Evaluation / Subscription fee).
  • Databricks | Data + AI Summit
  • Snowflake | Snowflake Summit
  • Kafka Summit
  • Airflow Summit
  • O'Reilly - Data Pipelines Pocket Reference
  • Manning - Data Pipelines with Apache Airflow
  • Snowflake
  • Databricks