https://github.com/codope/hudiinaction
Last synced: 11 months ago
- Host: GitHub
- URL: https://github.com/codope/hudiinaction
- Owner: codope
- Created: 2025-06-22T14:05:10.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-08-12T16:02:03.000Z (11 months ago)
- Last Synced: 2025-08-14T21:07:36.794Z (11 months ago)
- Language: Scala
- Size: 76.3 MB
- Stars: 0
- Watchers: 0
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
# Hudi In Action
Welcome to the code repository for the upcoming book **"Hudi In Action"**. This repository contains hands-on examples, tutorials, and code samples that demonstrate Apache Hudi's capabilities for building robust data lakes.
## 📖 About the Book
"Hudi In Action" is a comprehensive guide to [Apache Hudi](https://hudi.apache.org/), covering everything from basic concepts to advanced production patterns. The book provides practical examples and real-world scenarios to help you master Hudi for your data engineering needs.
## 🗂️ Repository Structure
```
hudiinaction/
├── chapter02/ # Getting Started with Hudi
│ ├── hudi_pipeline_quickstart.scala # Comprehensive Hudi tutorial
│ ├── trips_0.gz # NYC Taxi dataset sample (~1M rows)
│ └── README.md # Chapter-specific instructions
└── README.md # This file
```
Each chapter contains its own README with specific learning objectives, setup instructions, and detailed guidance.
## 📋 Prerequisites
Before running the examples, ensure you have:
### Software Requirements
- **Apache Spark** 3.5+ with Scala 2.12
- **Java** 8 or 11
- **Apache Hudi** 1.0.2+ (included via Spark packages)
### Hardware Requirements
- At least 4GB RAM available for Spark
- 2+ CPU cores recommended
- ~2GB disk space for sample data and tables
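The requirements above can be turned into a launch command. The following is a minimal sketch, not the chapter's official setup: it assumes Spark 3.5 built for Scala 2.12 and Hudi 1.0.2, follows the launch line from the Apache Hudi Spark quickstart, and treats the 4GB and 2-core guidance as local-mode settings. The variable names and the `hudi_shell` helper are hypothetical; the chapter README remains the authority on exact options.

```shell
# Assumed versions; adjust to match your Spark install and the Hudi release you want.
SPARK_LINE="3.5"
SCALA_LINE="2.12"
HUDI_VERSION="1.0.2"
HUDI_BUNDLE="org.apache.hudi:hudi-spark${SPARK_LINE}-bundle_${SCALA_LINE}:${HUDI_VERSION}"

# Hypothetical helper: start spark-shell in local mode with the Hudi bundle
# resolved through --packages. Extra arguments are passed through to spark-shell.
hudi_shell() {
  spark-shell \
    --master 'local[2]' \
    --driver-memory 4g \
    --packages "$HUDI_BUNDLE" \
    --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
    --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
    --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
    --conf 'spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar' \
    "$@"
}

# Show which bundle coordinate would be downloaded.
echo "$HUDI_BUNDLE"
```

Run `hudi_shell` once Spark is on your `PATH`. Keeping the versions in variables makes it easy to switch to a newer Hudi release without retyping the bundle coordinate.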
## 🚀 Getting Started
1. Clone this repository
2. Navigate to the chapter you want to explore
3. Follow the chapter-specific README for detailed setup instructions
4. Each chapter is self-contained with its own dataset and examples
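
The first two steps can be sketched as shell commands. This assumes `git` is installed and uses `chapter02`, the only chapter listed in the repository structure; the `open_chapter` helper and its variable names are hypothetical.

```shell
REPO_URL="https://github.com/codope/hudiinaction.git"
CHAPTER="chapter02"   # the chapter directory shown in the repository structure

# Hypothetical helper: clone the repository, enter the chosen chapter,
# and list its files (script, sample dataset, chapter README).
open_chapter() {
  git clone "$REPO_URL" hudiinaction &&
    cd "hudiinaction/$CHAPTER" &&
    ls
}
```

Call `open_chapter` from the directory where you want the checkout to live, then read the chapter's `README.md` before running anything.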
## 🤝 Contributing
Found an issue or want to improve the examples? Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes with clear commit messages
4. Submit a pull request
## 📄 License
This code is provided as supplementary material for "Hudi In Action". Please refer to the book's license terms for usage restrictions.
## 📧 Support
For questions about the book or code examples:
- Check the [Issues](https://github.com/codope/hudiinaction/issues) page
- Refer to the [Apache Hudi documentation](https://hudi.apache.org/)
- Visit the [Apache Hudi community](https://hudi.apache.org/community/)
---
Happy learning with Apache Hudi! 🎉