https://github.com/abhinavjain-0104/kafka-real-world-project
- Host: GitHub
- URL: https://github.com/abhinavjain-0104/kafka-real-world-project
- Owner: AbhinavJain-0104
- Created: 2024-10-08T12:45:36.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-10-09T10:29:59.000Z (4 months ago)
- Last Synced: 2024-11-16T01:42:02.813Z (3 months ago)
- Language: Java
- Size: 22.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Wikimedia Real-time Event Streaming and Analytics
This project demonstrates a real-world application of Apache Kafka for processing and analyzing real-time data streams from Wikimedia. It consists of a Spring Boot-based microservices architecture that ingests, processes, and visualizes recent changes to Wikimedia pages.
Key Features:
Real-time data ingestion from Wikimedia EventStreams API
Apache Kafka integration for reliable and scalable message queuing
Spring Boot microservices architecture (Producer and Consumer)
Data persistence using MySQL and Spring Data JPA
Web-based dashboard for visualizing Wikimedia changes
Modular design for easy extensibility and maintenance
Technology Stack:
Java 17+
Spring Boot 3.x
Apache Kafka
MySQL
Spring Data JPA
Maven
Thymeleaf (for web UI)
Project Structure:
The project is divided into two main modules:
kafka-producer-wikimedia: Responsible for consuming the Wikimedia EventStreams API and producing messages to a Kafka topic.
kafka-consumer-database: Consumes messages from the Kafka topic, persists data to MySQL, and serves a web UI for data visualization.
Future Enhancements:
Implement real-time updates using WebSocket for the dashboard
Add data analytics and aggregation features
Integrate with a time-series database for improved performance on large datasets
Implement data validation and error handling for improved reliability
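Example Sketch (Producer Ingestion):
The ingestion step of the kafka-producer-wikimedia module can be illustrated with a small, dependency-free sketch. This is not the project's actual code: the class name WikimediaStreamSketch, the endpoint URL, and the User-Agent value are assumptions made for illustration. It reads the EventStreams feed as Server-Sent Events using the JDK's built-in HttpClient and extracts the JSON payload from each "data:" line. In the real module, each payload would presumably be sent to a Kafka topic (for example through Spring's KafkaTemplate) rather than passed to a plain callback.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.stream.Stream;

class WikimediaStreamSketch {

    // Assumed endpoint: the public recent-changes stream. The README does not name the URL.
    static final String STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange";

    /** Returns the payload of an SSE "data:" line, or empty for any other line. */
    static Optional<String> extractData(String line) {
        if (line == null || !line.startsWith("data:")) {
            return Optional.empty();
        }
        String payload = line.substring("data:".length());
        // Per the SSE format, one leading space after the colon is not part of the value.
        if (payload.startsWith(" ")) {
            payload = payload.substring(1);
        }
        return payload.isEmpty() ? Optional.empty() : Optional.of(payload);
    }

    /** Connects to the stream and passes each payload to the sink; blocks until the stream ends. */
    static void stream(String url, Consumer<String> sink) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "text/event-stream")
                // Hypothetical identifier; replace with your own contact details.
                .header("User-Agent", "wikimedia-stream-sketch/0.1")
                .build();
        HttpResponse<Stream<String>> response =
                client.send(request, HttpResponse.BodyHandlers.ofLines());
        try (Stream<String> lines = response.body()) {
            lines.map(WikimediaStreamSketch::extractData)
                 .flatMap(Optional::stream)
                 .forEach(sink);
        }
    }

    public static void main(String[] args) {
        // Offline demonstration on hand-written sample lines (not real Wikimedia data).
        List<String> sample = List.of(
                ":ok",
                "event: message",
                "data: {\"title\":\"Example\"}",
                "");
        sample.stream()
              .map(WikimediaStreamSketch::extractData)
              .flatMap(Optional::stream)
              .forEach(System.out::println);
        // For a live run, call: stream(STREAM_URL, System.out::println);
    }
}
```

The sketch treats every "data:" line as one complete event. That is a simplification: the SSE format allows an event to span several "data:" lines, so a production client would buffer lines until the blank line that ends the event. Keeping the sink as a Consumer<String> makes it straightforward to swap the print call for a Kafka send later.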