Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with elt

A curated list of projects in awesome lists tagged with elt.

https://github.com/airbytehq/airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

bigquery change-data-capture data data-analysis data-collection data-engineering data-integration data-pipeline elt etl java mssql mysql pipeline postgresql python redshift s3 self-hosted snowflake

Last synced: 28 Sep 2024

https://github.com/apache/doris

Apache Doris is an easy-to-use, high performance and unified analytics database.

bigquery database dbt delta-lake elt etl hadoop hive hudi iceberg lakehouse olap query-engine real-time redshift snowflake spark sql

Last synced: 29 Sep 2024

https://github.com/apache/incubator-doris

Apache Doris is an easy-to-use, high performance and unified analytics database.

bigquery database dbt delta-lake elt etl hadoop hive hudi iceberg lakehouse olap query-engine real-time redshift snowflake spark sql

Last synced: 04 Aug 2024

https://github.com/fishtown-analytics/dbt

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

analytics business-intelligence data-modeling dbt-viewpoint elt pypa slack

Last synced: 12 Sep 2024

https://github.com/dbt-labs/dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

analytics business-intelligence data-modeling dbt-viewpoint elt pypa slack

Last synced: 29 Sep 2024

https://github.com/apache/incubator-seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

apache batch cdc change-data-capture data-ingestion data-integration elt high-performance offline real-time streaming

Last synced: 17 Aug 2024

https://github.com/apache/seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

apache batch cdc change-data-capture data-ingestion data-integration elt high-performance offline real-time streaming

Last synced: 29 Sep 2024

https://github.com/kestra-io/kestra

Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.

data data-engineering data-integration data-orchestration data-orchestrator data-pipeline data-quality elt etl low-code orchestration pipeline reverse-etl scheduler workflow workflow-engine

Last synced: 29 Sep 2024

https://github.com/dlt-hub/dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

data data-engineering data-lake data-loading data-warehouse elt extract load python transform

Last synced: 31 Jul 2024

https://github.com/meltano/meltano

Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

connectors data data-engineering data-pipelines dataops dataops-platform elt extract-data integration loaders meltano meltano-sdk open-source opensource pipelines singer tap taps target targets

Last synced: 01 Oct 2024

https://github.com/TobikoData/sqlmesh

Efficient data transformation and modeling framework that is backwards compatible with dbt.

dataengineering dataops dbt elt etl python sql transformation

Last synced: 31 Jul 2024

https://github.com/dataform-co/dataform

Dataform is a framework for managing SQL based data operations in BigQuery

analytics business-intelligence data-engineering data-pipelines elt etl hacktoberfest

Last synced: 29 Sep 2024

https://github.com/kuwala-io/kuwala

Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we feed third-party data into data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times.

admin-boundaries data data-integration data-science dbt elt google-trends jupyter kuwala no-code open-data open-source population postgres pyspark python react react-flow scraping spatial-analysis

Last synced: 01 Aug 2024

https://github.com/raystack/optimus

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows

Last synced: 29 Sep 2024

https://github.com/artie-labs/transfer

Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift) in real-time.

apache-kafka bigquery cdc change-data-capture data-integration data-pipelines database debezium elt golang kafka redshift snowflake

Last synced: 29 Sep 2024

https://github.com/quarylabs/quary

Transform data together. Model, test and deploy as a team.

analytics business-intelligence data-modeling elt

Last synced: 31 Jul 2024

https://github.com/Datavault-UK/automate-dv

A free-to-use dbt package for creating and loading Data Vault 2.0 compliant data warehouses (powered by dbt, an open source data engineering tool and a registered trademark of dbt Labs).

data-vault dataengineering datalake datavault datavault20 datawarehouse datawarehousing dbt elt etl metadata snowflake sql

Last synced: 03 Aug 2024

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 29 Sep 2024

https://github.com/umitkaanusta/reddit-detective

Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more

analysis analytics api data database elt etl graph graph-database neo4j network politics reddit social social-media social-network

Last synced: 29 Sep 2024

https://github.com/datacoves/dbt-coves

CLI tool for dbt users that simplifies the creation of staging model files (yml and sql).

analytics bigquery datacoves dbt elt etl jinja python redshift snowflake sql

Last synced: 29 Sep 2024

https://github.com/unytics/airbyte_serverless

Airbyte made simple (no UI, no database, no cluster)

airbyte bigquery data data-analysis data-engineering data-warehouse elt etl pipeline

Last synced: 29 Sep 2024

https://github.com/zsvoboda/dbd

dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.

bigquery csv database database-schemas elt etl excel json mysql parquet postgresql python python3 redshift snowflake sql sqlite xls xlsx

Last synced: 28 Sep 2024

https://github.com/ascrus/getl

A tool for developing and testing ETL and ELT processes for automating the capture, delivery and processing of information in data warehouses on the MicroFocus Vertica platform.

csv dsl elt etl excel hdfs hive impala json kafka sql unit-testing vertica xml

Last synced: 01 Aug 2024

https://github.com/childmindresearch/bids2table

Efficiently index large-scale BIDS neuroimaging datasets and derivatives

arrow bids data-pipeline elt etl neuroimaging parquet

Last synced: 01 Aug 2024

https://github.com/shipyardapp/postgresql-blueprints

Simplified blueprints for building data pipelines with PostgreSQL.

cli data-analysis data-engineering data-pipeline data-science database elt etl postgres postgresql

Last synced: 13 Aug 2024

https://github.com/shipyardapp/amazonathena-blueprints

Simplified blueprints for building data pipelines with Amazon Athena.

amazon-athena athena cli data-analysis data-engineering data-science elt etl

Last synced: 13 Aug 2024

https://github.com/salvatoreamaddio/pipelinewebsite

This console application is an ad-hoc solution for a client who needed a way to extract data from their own website and write it to a spreadsheet.

csharp csharp-app csharp-code elt elt-pipeline excel-export web

Last synced: 28 Sep 2024

https://github.com/davidkhala/etl

Collection of data Extract, Transform, Load

apache-beam dbt elt etl fivetran

Last synced: 02 Oct 2024