https://github.com/jesufemi-o/hacking-apache-iceberg
Tequila: My journey to Apache Iceberg. You'll definitely need your pinch of salt! Opinions are mine!!
- Host: GitHub
- URL: https://github.com/jesufemi-o/hacking-apache-iceberg
- Owner: JesuFemi-O
- Created: 2024-07-26T01:40:44.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-12T09:32:59.000Z (about 1 year ago)
- Last Synced: 2024-10-14T14:06:33.443Z (12 months ago)
- Topics: etl-framework, iceberg, lakehouse-platform
- Language: Python
- Homepage:
- Size: 44.3 MB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Baby Steps to Learning and Mastering Apache Iceberg: Unruly Edition
There’s been this buzz about Iceberg. Apparently, it’s a table format that has existed for a while but has been getting a really, really high amount of attention recently. At least, I saw so much hype around it after Snowflake announced support for Iceberg and unveiled the Polaris catalog, and Databricks announced its acquisition of Tabular.
My biggest confusion has been where to start. What the hell is Iceberg? Why’s everyone talking about a lakehouse? What is Trino? And then there’s also Starburst, Tabular, and Dremio??
# Progress
I'm making notes as I progress, and you can view them here:
- [Intro/rough roadmap](./notes/intro.md)
- [Setting up the infra](./notes/setup-infra.md)
- [Using PyIceberg to load data](./notes/pyiceberg.md)