https://github.com/jayhan94/minilake

# MiniLake
A modern mini lakehouse based on Spark and Iceberg, running in Docker.

# Usage
Build and run
```bash
docker compose up --build
```

Attach to the Spark container and start the Spark SQL shell
```bash
docker exec -it spark-iceberg /opt/spark/bin/spark-sql
```

Create table
```SQL
CREATE TABLE student (id INT, name STRING, age INT) USING ICEBERG LOCATION 's3://minilake/student';
```

Insert data
```SQL
INSERT INTO student VALUES (1, 'jay', 15), (2, 'dove', 15);
```

Execute query
```SQL
SELECT * FROM student;
```
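
The statements above can also be run without opening an interactive shell. The following is a minimal sketch, assuming the container name and `spark-sql` path shown earlier; `run_sql`, `CONTAINER`, and `SPARK_SQL` are hypothetical names introduced here for illustration and are not part of MiniLake. It relies on the `-e` option of `spark-sql`, which executes a quoted query string and exits.
```bash
# Assumption: container name and spark-sql path match the commands above.
# Both can be overridden through environment variables.
CONTAINER="${CONTAINER:-spark-iceberg}"
SPARK_SQL="${SPARK_SQL:-/opt/spark/bin/spark-sql}"

# Hypothetical helper (not part of MiniLake): passes one SQL string
# to spark-sql inside the container using the -e option.
run_sql() {
  docker exec "$CONTAINER" "$SPARK_SQL" -e "$1"
}
```

Once the stack is up, `run_sql "SELECT * FROM student;"` should print the rows of the table, provided the configured catalog keeps tables across sessions.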

# TODO
1. A standalone catalog server.
2. Ingesting real-time data from Kafka.
3. Change data capture (CDC).