Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with data-warehouse
A curated list of projects in awesome lists tagged with data-warehouse.
https://github.com/onurakpolat/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
awesome awesome-list bigdata data data-analytics data-science data-stream data-visualization data-warehouse database distributed-database series-database stream-processing streaming-data visualize-data
Last synced: 31 Jul 2024
https://github.com/0xnr/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
awesome awesome-list bigdata data data-analytics data-science data-stream data-visualization data-warehouse database distributed-database series-database stream-processing streaming-data visualize-data
Last synced: 31 Jul 2024
https://github.com/greenplum-db/gpdb
Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.
analytics data-warehouse database gpdb greenplum-database htap mpp postgresql
Last synced: 29 Sep 2024
https://github.com/MaterializeInc/materialize
The data warehouse for operational workloads.
data-warehouse database distributed-systems kafka materialized-view operational-data-warehouse postgresql postgresql-dialect rust sql stream-processing streaming streaming-data
Last synced: 31 Jul 2024
https://github.com/materializeinc/materialize
The data warehouse for operational workloads.
data-warehouse database distributed-systems kafka materialized-view operational-data-warehouse postgresql postgresql-dialect rust sql stream-processing streaming streaming-data
Last synced: 29 Sep 2024
https://github.com/rudderlabs/rudder-server
Privacy and Security focused Segment-alternative, in Golang and React
bigquery customer-data customer-data-lake customer-data-pipeline customer-data-platform data-integration data-pipeline data-synchronization data-warehouse etl golang hybrid-cloud privacy redshift rudderstack security segment-alternative snowflake warehouse-first warehouse-management
Last synced: 29 Sep 2024
https://github.com/hydradatabase/hydra
Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes.
data-warehouse datawarehouse postgres postgresql postgresql-extension
Last synced: 27 Sep 2024
https://github.com/blankerl/dxy-covid-19-data
COVID-19/2019-nCoV Infection Time Series Data Warehouse
Last synced: 30 Sep 2024
https://github.com/dlt-hub/dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
data data-engineering data-lake data-loading data-warehouse elt extract load python transform
Last synced: 31 Jul 2024
https://github.com/elementary-data/elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
analytics-engineer bigquery data-analysis data-governance data-lineage data-observability data-pipeline data-pipelines data-reliability data-warehouse dataops dbt dbt-artifacts dbt-packages lineage redshift snowflake
Last synced: 30 Sep 2024
https://github.com/DataBrewery/cubes
[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis
cube data data-analysis data-warehouse multidimensional-analysis olap sql
Last synced: 31 Jul 2024
https://github.com/san089/udacity-data-engineering-projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
airflow airflow-operators aws aws-ec2 aws-s3 aws-sdk cassandra cassandra-database cloudformation cluster data data-engineering data-engineering-pipeline data-lake data-modeling data-warehouse etl-pipeline infrastructure postgres postgresql-database
Last synced: 29 Sep 2024
https://github.com/tensorbase/tensorbase
TensorBase is a new big data warehouse built with a modern approach.
analytics bigdata data data-infrastructure data-warehouse database engineering high-performance infrastructure modern rust rust-lang warehouse
Last synced: 30 Sep 2024
https://github.com/san089/Udacity-Data-Engineering-Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
airflow airflow-operators aws aws-ec2 aws-s3 aws-sdk cassandra cassandra-database cloudformation cluster data data-engineering data-engineering-pipeline data-lake data-modeling data-warehouse etl-pipeline infrastructure postgres postgresql-database
Last synced: 01 Aug 2024
https://github.com/cloudera/hue
Open source SQL Query Assistant service for Databases/Warehouses
autocomplete compose data-warehouse databases query-editor sql sql-assistant sql-editor
Last synced: 30 Sep 2024
https://github.com/scratchdata/scratchdata
Scratch is a Swiss Army knife for big data.
bigquery clickhouse data-warehouse duckdb hacktoberfest motherduck olap redshift snowflake
Last synced: 26 Sep 2024
https://github.com/googlecloudplatform/bigquery-utils
Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.
bigquery data-warehouse google-cloud-platform sql utilities
Last synced: 01 Oct 2024
https://github.com/GoogleCloudPlatform/bigquery-utils
Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.
bigquery data-warehouse google-cloud-platform sql utilities
Last synced: 02 Aug 2024
https://github.com/alanchn31/data-engineering-projects
Personal Data Engineering Projects
airflow aws-redshift cassandra data-engineering data-engineering-nanodegree data-lake data-modeling data-warehouse ingest-data mongodb postgres scrapy spark star-schema
Last synced: 29 Sep 2024
https://github.com/alanchn31/Data-Engineering-Projects
Personal Data Engineering Projects
airflow aws-redshift cassandra data-engineering data-engineering-nanodegree data-lake data-modeling data-warehouse ingest-data mongodb postgres scrapy spark star-schema
Last synced: 01 Aug 2024
https://github.com/raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows
Last synced: 29 Sep 2024
https://github.com/Multiwoven/multiwoven
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack. Leading Reverse ETL and Customer Data Platform (CDP) for Data Teams.
bigquery cdp customer-data-platform data-activation data-engineering data-pipeline data-warehouse databricks dbt etl hacktoberfest open-source postresql react redshift reverse-etl ruby self-hosted snowflake typescript
Last synced: 01 Aug 2024
https://github.com/multiwoven/multiwoven
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack. Leading Reverse ETL and Customer Data Platform (CDP) for Data Teams.
bigquery cdp customer-data-platform data-activation data-engineering data-pipeline data-warehouse databricks dbt etl hacktoberfest open-source postresql react redshift reverse-etl ruby self-hosted snowflake typescript
Last synced: 29 Sep 2024
https://github.com/canner/vulcan-sql
Data API Framework for AI Agents and Data Apps
ai ai-agent analytics api-builder bigquery clickhouse data-lake data-warehouse database duckdb ksqldb postgresql reporting restful-api snowflake spreadsheet sql typescript vulcan-sql vulcansql
Last synced: 26 Sep 2024
https://github.com/Canner/vulcan-sql
Data API Framework for AI Agents and Data Apps
ai ai-agent analytics api-builder bigquery clickhouse data-lake data-warehouse database duckdb ksqldb postgresql reporting restful-api snowflake spreadsheet sql typescript vulcan-sql vulcansql
Last synced: 01 Aug 2024
https://github.com/unytics/bigfunctions
Supercharge BigQuery with BigFunctions
bigquery data data-analytics data-engineering data-visualization data-warehouse
Last synced: 29 Sep 2024
https://github.com/domainmod/domainmod
DomainMOD is an open source application written in PHP & MySQL used to manage your domains and other internet assets in a central location. DomainMOD also includes a Data Warehouse framework that allows you to import your web server data so that you can view, export, and report on your live data.
cpanel data-warehouse domains hacktoberfest mariadb mysql php whm
Last synced: 27 Sep 2024
https://github.com/vmware/versatile-data-kit
One framework to develop, deploy and operate data workflows with Python and SQL.
analytics data data-engineer data-engineering data-engineering-pipeline data-lineage data-pipelines data-science data-structures data-warehouse database dataops elt etl pipeline python snowflake sql trino warehouse
Last synced: 03 Aug 2024
https://github.com/cloudberrydb/cloudberrydb
Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.
ai cloudberrydb data-analysis data-warehouse database database-management gpdb greenplum greenplum-database mpp olap postgres postgresql postgresql-database sql
Last synced: 31 Jul 2024
https://github.com/intermine/intermine
A powerful open source data warehouse system
api bioinformatics biology clojure clojurescript data-visualisation data-visualization data-warehouse genetics genomics java lgplv3 open-source opensource perl postgresql python tomcat tomcat8 webservices
Last synced: 01 Oct 2024
https://github.com/ubisoft/mobydq
🐳 Tool to automate data quality checks on data pipelines
big-data data-pipeline data-quality data-quality-checks data-quality-monitoring data-warehouse
Last synced: 02 Aug 2024
https://github.com/gokumohandas/data-engineering
Construct a modern data stack and orchestrate the workflows to create high quality data for analytics and ML applications.
airflow data-engineering data-warehouse dbt etl machine-learning mlops orchestration
Last synced: 03 Oct 2024
https://github.com/dalenewman/transformalize
Configurable Extract, Transform, and Load
data-warehouse denormalize elasticsearch etl etl-framework excel files mysql postgresql solr sql-server sqlce sqlite ssas
Last synced: 28 Sep 2024
https://github.com/dalenewman/Transformalize
Configurable Extract, Transform, and Load
data-warehouse denormalize elasticsearch etl etl-framework excel files mysql postgresql solr sql-server sqlce sqlite ssas
Last synced: 03 Aug 2024
https://github.com/tuva-health/tuva
Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
analytics-engineering bigquery data-analytics data-governance data-lineage data-pipelines data-warehouse dbt dbt-packages healthcare healthcare-analysis healthcare-data open-source redshift snowflake sql terminology
Last synced: 29 Sep 2024
https://github.com/unytics/airbyte_serverless
Airbyte made simple (no UI, no database, no cluster)
airbyte bigquery data data-analysis data-engineering data-warehouse elt etl pipeline
Last synced: 29 Sep 2024
https://github.com/Rello/analytics
Analytics - Open source data warehouse and reporting for Nextcloud
analytics data data-warehouse datasources nextcloud visualization
Last synced: 01 Aug 2024
https://github.com/beneath-hq/beneath
Beneath is a serverless real-time data platform ⚡️
analytics beneath data-engineering data-pipelines data-science data-warehouse dataops developer-tools etl go kubernetes mlops python sql streaming
Last synced: 01 Aug 2024
https://github.com/gclunies/reflekt
Define, govern, and model event data for warehouse-first product analytics.
avo customer-data-platform data-modeling data-quality data-warehouse dbt dbt-package events governance product-analytics schema-registry segment segment-protocols
Last synced: 26 Sep 2024
https://github.com/GClunies/Reflekt
Define, govern, and model event data for warehouse-first product analytics.
avo customer-data-platform data-modeling data-quality data-warehouse dbt dbt-package events governance product-analytics schema-registry segment segment-protocols
Last synced: 05 Sep 2024
https://github.com/scottpersinger/pgwarehouse
Easily sync your Postgres database to a Snowflake, ClickHouse, or DuckDB warehouse.
analytics clickhouse data-warehouse postgres postgresql snowflake synchronization warehouse
Last synced: 03 Sep 2024
https://github.com/MassStreetAnalytics/etl-framework
A framework for moving data into a data warehouse.
data-warehouse etl etl-components etl-framework etl-pipeline python sql sqlserver
Last synced: 08 Aug 2024
https://github.com/umer7/Data-Warehouse-Concepts-Design-and-Data-Integration
Repo for Data Warehouse Concepts, Design, and Data Integration by University of Colorado System (Coursera): notes, assignments, quizzes, and research papers.
data-integration data-warehouse datawarehouse oracle pentaho
Last synced: 08 Aug 2024
https://github.com/fibo/olap-cube
olap-cube is a hypercube of data
business-intelligence cube data-warehouse data-warehousing dwh olap olap-cube pivot pivot-tables report table
Last synced: 01 Oct 2024
https://github.com/aconstandinou/data-warehouse-build
data data-warehouse postgresql-database python quantitative-finance
Last synced: 13 Aug 2024
https://github.com/googlecloudplatform/datacatalog-connectors-hive
Sample code with integration between Data Catalog and Hive data source.
analytics apache-atlas data-warehouse datacatalog gcp hive hive-metastore metadata-management python
Last synced: 28 Sep 2024
https://github.com/Canner/vulcan-sql-examples
Curated VulcanSQL showcases
analytics api-builder bigquery data data-lake data-warehouse database duckdb examples postgresql reporting restful-api sql vulcan-sql vulcansql
Last synced: 01 Aug 2024
https://github.com/fulldecent/google-sheets-etl
Live import all your Google Sheets to your data warehouse
Last synced: 04 Aug 2024
https://github.com/bondxue/Data-Warehouse-with-AWS
🍄 Udacity Data Engineering Nanodegree Project 3
aws-s3 data-warehouse redshift
Last synced: 13 Aug 2024
https://github.com/AuFeld/Data_Engineering_Projects
A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousing, containerization, and a dashboard to monitor data pipeline KPIs
airflow aws cassandra data-engineering data-lake data-warehouse docker emr etl-pipeline infrastructure-as-code infrastructure-setup postgresql python redshift s3 spark
Last synced: 13 Aug 2024
https://github.com/genenotebook/genenotebook
A collaborative notebook for genes and genomes
d3js data-warehouse expression-browser genome-analysis genome-annotation genome-browser meteorjs mongodb nodejs reactjs
Last synced: 30 Sep 2024
https://github.com/namdnguyen/dbt-tutorial
dbt tutorial project with Kimball data warehouse modeling, Jinja templating, and schema tests.
analytics data-warehouse dbt sql tutorial
Last synced: 08 Aug 2024
https://github.com/mongoexpuser/aws-redshift-serverless-with-aws-sdk-js-v3
Deployment and Modeling of AWS Redshift Serverless
aws aws-sdk-3 data-warehouse debian javascript nodejs plpgsql redshift-serverless ubuntu
Last synced: 27 Sep 2024
https://github.com/maxinexiong/cloud-data-warehousing-with-aws-redshift
This project builds a cloud-based ETL pipeline for Sparkify to move data to a cloud data warehouse. It extracts song and user activity data from AWS S3, stages it in Redshift, and transforms it into a star-schema data model with fact and dimension tables, enabling efficient querying to answer business questions.
aws-boto3 aws-redshift aws-s3 cloud-data-warehouse data-warehouse data-warehousing dimensional-model dimensional-modeling etl etl-pipeline extract-transform-load infrastructure-as-code postgresql postgresql-database redshift-cluster
Last synced: 26 Sep 2024
https://github.com/yasarsultan/olist_datawarehouse
An end-to-end data pipeline that extracts data, processes it, and then loads it into the BigQuery data warehouse.
airflow bigquery data-warehouse docker
Last synced: 29 Sep 2024
https://github.com/vaxdata22/nosql-and-big-data-demonstration
This is a fun assignment task I undertook to explore the world of NoSQL and Big Data technologies.
apache-hive cassandra-cql cypher-query-language data-warehouse hadoop-hdfs json mongodb neo4j nosql-databases redis
Last synced: 26 Sep 2024
https://github.com/manuelandersen/football-pipeline
DE Zoomcamp 2024 Final Project 🧙
bigquery data-engineering data-lake data-warehouse dbt dbt-cloud etl-pipeline google-cloud looker-studio mageai python
Last synced: 29 Sep 2024
https://github.com/ivdatahub/pypi-package-stats
Project for ingesting PyPI package data from BigQuery and sending it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 29 Sep 2024