Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Projects in Awesome Lists tagged with data-warehouse

A curated list of projects in awesome lists tagged with data-warehouse.

https://github.com/greenplum-db/gpdb

Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.

analytics data-warehouse database gpdb greenplum-database htap mpp postgresql

Last synced: 29 Sep 2024

https://github.com/hydradatabase/hydra

Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes.

data-warehouse datawarehouse postgres postgresql postgresql-extension

Last synced: 27 Sep 2024

https://github.com/blankerl/dxy-covid-19-data

COVID-19/2019-nCoV Infection Time Series Data Warehouse

2019-ncov data-warehouse

Last synced: 30 Sep 2024

https://github.com/dlt-hub/dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

data data-engineering data-lake data-loading data-warehouse elt extract load python transform

Last synced: 31 Jul 2024

https://github.com/elementary-data/elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

analytics-engineer bigquery data-analysis data-governance data-lineage data-observability data-pipeline data-pipelines data-reliability data-warehouse dataops dbt dbt-artifacts dbt-packages lineage redshift snowflake

Last synced: 30 Sep 2024

https://github.com/DataBrewery/cubes

[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis

cube data data-analysis data-warehouse multidimensional-analysis olap sql

Last synced: 31 Jul 2024

https://github.com/cloudera/hue

Open source SQL Query Assistant service for Databases/Warehouses

autocomplete compose data-warehouse databases query-editor sql sql-assistant sql-editor

Last synced: 30 Sep 2024

https://github.com/googlecloudplatform/bigquery-utils

Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.

bigquery data-warehouse google-cloud-platform sql utilities

Last synced: 01 Oct 2024

https://github.com/GoogleCloudPlatform/bigquery-utils

Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.

bigquery data-warehouse google-cloud-platform sql utilities

Last synced: 02 Aug 2024

https://github.com/raystack/optimus

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows

Last synced: 29 Sep 2024

https://github.com/Multiwoven/multiwoven

🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack. Leading Reverse ETL and Customer Data Platform (CDP) for Data Teams.

bigquery cdp customer-data-platform data-activation data-engineering data-pipeline data-warehouse databricks dbt etl hacktoberfest open-source postresql react redshift reverse-etl ruby self-hosted snowflake typescript

Last synced: 01 Aug 2024

https://github.com/multiwoven/multiwoven

🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack. Leading Reverse ETL and Customer Data Platform (CDP) for Data Teams.

bigquery cdp customer-data-platform data-activation data-engineering data-pipeline data-warehouse databricks dbt etl hacktoberfest open-source postresql react redshift reverse-etl ruby self-hosted snowflake typescript

Last synced: 29 Sep 2024

https://github.com/domainmod/domainmod

DomainMOD is an open source application written in PHP & MySQL used to manage your domains and other internet assets in a central location. DomainMOD also includes a Data Warehouse framework that allows you to import your web server data so that you can view, export, and report on your live data.

cpanel data-warehouse domains hacktoberfest mariadb mysql php whm

Last synced: 27 Sep 2024

https://github.com/cloudberrydb/cloudberrydb

Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.

ai cloudberrydb data-analysis data-warehouse database database-management gpdb greenplum greenplum-database mpp olap postgres postgresql postgresql-database sql

Last synced: 31 Jul 2024

https://github.com/ubisoft/mobydq

🐳 Tool to automate data quality checks on data pipelines

big-data data-pipeline data-quality data-quality-checks data-quality-monitoring data-warehouse

Last synced: 02 Aug 2024

https://github.com/gokumohandas/data-engineering

Construct a modern data stack and orchestrate the workflows to create high-quality data for analytics and ML applications.

airflow data-engineering data-warehouse dbt etl machine-learning mlops orchestration

Last synced: 03 Oct 2024

https://github.com/unytics/airbyte_serverless

Airbyte made simple (no UI, no database, no cluster)

airbyte bigquery data data-analysis data-engineering data-warehouse elt etl pipeline

Last synced: 29 Sep 2024

https://github.com/Rello/analytics

Analytics - Open source data warehouse and reporting for Nextcloud

analytics data data-warehouse datasources nextcloud visualization

Last synced: 01 Aug 2024

https://github.com/scottpersinger/pgwarehouse

Easily sync your Postgres database to a Snowflake, ClickHouse, or DuckDB warehouse.

analytics clickhouse data-warehouse postgres postgresql snowflake synchronization warehouse

Last synced: 03 Sep 2024

https://github.com/MassStreetAnalytics/etl-framework

A framework for moving data into a data warehouse.

data-warehouse etl etl-components etl-framework etl-pipeline python sql sqlserver

Last synced: 08 Aug 2024

https://github.com/umer7/Data-Warehouse-Concepts-Design-and-Data-Integration

Repo for Data Warehouse Concepts, Design, and Data Integration by the University of Colorado System (Coursera): notes, assignments, quizzes, and research papers.

data-integration data-warehouse datawarehouse oracle pentaho

Last synced: 08 Aug 2024

https://github.com/googlecloudplatform/datacatalog-connectors-hive

Sample code with integration between Data Catalog and Hive data source.

analytics apache-atlas data-warehouse datacatalog gcp hive hive-metastore metadata-management python

Last synced: 28 Sep 2024

https://github.com/fulldecent/google-sheets-etl

Live import all your Google Sheets to your data warehouse

data-vault data-warehouse etl

Last synced: 04 Aug 2024

https://github.com/bondxue/Data-Warehouse-with-AWS

🍄 Udacity Data Engineering Nanodegree Project 3

aws-s3 data-warehouse redshift

Last synced: 13 Aug 2024

https://github.com/AuFeld/Data_Engineering_Projects

A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousing, containerization, and a dashboard to monitor data pipeline KPIs

airflow aws cassandra data-engineering data-lake data-warehouse docker emr etl-pipeline infrastructure-as-code infrastructure-setup postgresql python redshift s3 spark

Last synced: 13 Aug 2024

https://github.com/namdnguyen/dbt-tutorial

dbt tutorial project with Kimball data warehouse modeling, Jinja templating, and schema tests.

analytics data-warehouse dbt sql tutorial

Last synced: 08 Aug 2024

https://github.com/maxinexiong/cloud-data-warehousing-with-aws-redshift

This project builds a cloud-based ETL pipeline for Sparkify to move data to a cloud data warehouse. It extracts song and user activity data from AWS S3, stages it in Redshift, and transforms it into a star-schema data model with fact and dimension tables, enabling efficient querying to answer business questions.

aws-boto3 aws-redshift aws-s3 cloud-data-warehouse data-warehouse data-warehousing dimensional-model dimensional-modeling etl etl-pipeline extract-transform-load infrastructure-as-code postgresql postgresql-database redshift-cluster

Last synced: 26 Sep 2024

https://github.com/yasarsultan/olist_datawarehouse

An end-to-end data pipeline that extracts data, processes it, and then loads it into the BigQuery data warehouse.

airflow bigquery data-warehouse docker

Last synced: 29 Sep 2024

https://github.com/vaxdata22/nosql-and-big-data-demonstration

This is a fun assignment task I undertook to explore the world of NoSQL and Big Data technologies.

apache-hive cassandra-cql cypher-query-language data-warehouse hadoop-hdfs json mongodb neo4j nosql-databases redis

Last synced: 26 Sep 2024

https://github.com/ivdatahub/pypi-package-stats

Project for ingesting PyPI package data from BigQuery and sending it to Datadog for analysis and insights with dashboards, monitors, and more.

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 29 Sep 2024