Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with lakehouse

A curated list of projects in awesome lists tagged with lakehouse.

https://github.com/prestodb/presto

The official home of the Presto distributed SQL query engine for big data

big-data data hadoop hive java lakehouse presto query sql

Last synced: 29 Sep 2024

https://github.com/apache/doris

Apache Doris is an easy-to-use, high-performance, unified analytics database.

bigquery database dbt delta-lake elt etl hadoop hive hudi iceberg lakehouse olap query-engine real-time redshift snowflake spark sql

Last synced: 29 Sep 2024

https://github.com/apache/incubator-doris

Apache Doris is an easy-to-use, high-performance, unified analytics database.

bigquery database dbt delta-lake elt etl hadoop hive hudi iceberg lakehouse olap query-engine real-time redshift snowflake spark sql

Last synced: 04 Aug 2024

https://github.com/StarRocks/starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.

analytics big-data cloudnative database datalake delta-lake distributed-database hudi iceberg join lakehouse lakehouse-platform mpp olap real-time-analytics real-time-updates realtime-database sql star-schema vectorized

Last synced: 30 Jul 2024

https://github.com/starrocks/starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.

analytics big-data cloudnative database datalake delta-lake distributed-database hudi iceberg join lakehouse lakehouse-platform mpp olap real-time-analytics real-time-updates realtime-database sql star-schema vectorized

Last synced: 29 Sep 2024

https://github.com/lakesoul-io/LakeSoul

LakeSoul is an end-to-end, real-time, cloud-native lakehouse framework with fast data ingestion, concurrent updates, and incremental data analytics on cloud storage for both BI and AI applications.

arrow big-data datafusion datalake flink huggingface lakehouse lakesoul postgresql python pytorch rust spark sql streaming vectorized velox

Last synced: 31 Jul 2024

https://github.com/lakesoul-io/lakesoul

LakeSoul is an end-to-end, real-time, cloud-native lakehouse framework with fast data ingestion, concurrent updates, and incremental data analytics on cloud storage for both BI and AI applications.

arrow big-data datafusion datalake flink huggingface lakehouse lakesoul postgresql python pytorch rust spark sql streaming vectorized velox

Last synced: 28 Sep 2024

https://github.com/ytsaurus/ytsaurus

YTsaurus is a scalable and fault-tolerant open-source big data platform.

big-data clickhouse distributed-database lakehouse olap-database spark sql ytsaurus

Last synced: 28 Sep 2024

https://github.com/apache/amoro

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.

bigdata datalake lakehouse

Last synced: 01 Aug 2024

https://github.com/apache/gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.

ai-catalog data-catalog datalake federated-query lakehouse metadata metalake model-catalog opendatacatalog skycomputing stratosphere

Last synced: 30 Sep 2024

https://github.com/apache/Gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.

ai-catalog data-catalog datalake federated-query lakehouse metadata metalake model-catalog opendatacatalog skycomputing stratosphere

Last synced: 28 Sep 2024

https://github.com/qinsql/QinSQL

An intelligent database for the AI era

lakehouse olap oltp

Last synced: 01 Aug 2024

https://github.com/data-dot-all/dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.

aws aws-glue aws-lake-formation aws-s3 data data-science etl-framework lakeformation lakehouse redshift

Last synced: 13 Aug 2024

https://github.com/databricks/terraform-databricks-examples

Examples of using Terraform to deploy Databricks resources

aws azure databricks databricks-module gcp lakehouse terraform terraform-module

Last synced: 26 Sep 2024

https://github.com/icelake-io/icelake

Pure Rust Iceberg Implementation

iceberg lakehouse rust

Last synced: 31 Jul 2024

https://github.com/adidas/lakehouse-engine-docs

The goal of this project is to provide documentation for the Lakehouse Engine framework.

big-data data-engineering data-quality databricks delta-lake framework great-expectations lakehouse lakehouse-engine spark

Last synced: 28 Sep 2024