https://github.com/to-infinitee/real-time-data-system-arch
The architecture ingests data via Kafka, processes it in real-time with Spark Streaming, and stores it in Cassandra and Hadoop HDFS. It supports direct data push to apps using WebSockets/HTTP Streaming, with a front-end built on Spring Boot, Bootstrap.js, and Chart.js.
backend data-streaming frontend kafka real-time rest-api websocket
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/to-infinitee/real-time-data-system-arch
- Owner: to-infinitee
- Created: 2024-08-22T00:22:53.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-22T00:35:02.000Z (almost 2 years ago)
- Last Synced: 2025-03-03T06:46:03.409Z (over 1 year ago)
- Topics: backend, data-streaming, frontend, kafka, real-time, rest-api, websocket
- Homepage:
- Size: 353 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README

This system architecture is designed for real-time and batch data processing in a financial or trading environment. It ingests data from sources such as Market Data, Stock Prices, and Trades into a Kafka Streaming Cluster. Spark Streaming then processes the data in real time, and the results are stored in a Cassandra data lake or Hadoop HDFS for further use. Processed data is pushed directly to web and mobile apps via WebSockets/HTTP Streaming and is also exposed through APIs. Spring Boot serves the APIs and web services, and the front-end uses Bootstrap.js and Chart.js for responsive user interfaces.
## System Components
### 1. Data Sources
- **Market Data**
- **Stock Prices**
- **Trades**
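
The README does not specify a message format for these sources. As a minimal sketch, assuming trades are serialized as JSON and keyed by symbol, an event and its encoding could look like this. The `Trade` class, its field names, and the helper functions are hypothetical and do not come from the repository.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Trade:
    """Hypothetical trade event; the fields are illustrative only."""
    symbol: str
    price: float
    quantity: int
    timestamp_ms: int  # event time in epoch milliseconds


def encode_trade(trade: Trade) -> Tuple[bytes, bytes]:
    """Return (key, value) bytes in the shape a Kafka producer would send."""
    key = trade.symbol.encode("utf-8")
    value = json.dumps(asdict(trade), sort_keys=True).encode("utf-8")
    return key, value


def decode_trade(value: bytes) -> Trade:
    """Rebuild a Trade from the JSON value bytes."""
    return Trade(**json.loads(value.decode("utf-8")))
```

Keying by symbol is one possible design choice: it keeps all events for one instrument together, which simplifies per-symbol aggregation downstream.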
### 2. Kafka Streaming Cluster
- Acts as the central hub for ingesting data streams from multiple sources.
- Provides scalable, distributed data streaming.
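
Kafka scales by splitting each topic into partitions, and messages that share a key land in the same partition. The sketch below illustrates that idea only. Kafka's Java client hashes keys with murmur2; `zlib.crc32` is a stand-in here so the example runs with the standard library, and `partition_for` is a hypothetical helper, not a Kafka API.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index.

    crc32 is a stand-in for Kafka's murmur2 hash, so the indices differ
    from what a real producer would compute.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    for symbol in (b"AAPL", b"MSFT", b"GOOG"):
        print(symbol.decode(), "->", partition_for(symbol, 6))
```

Because the mapping is deterministic, all events for one symbol stay ordered within a single partition, while different symbols spread across the cluster.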
### 3. Data Ingestion
#### Real-Time Processing
- **Apache Spark Streaming** processes the data in real time.
- Operations: `Transform`, `Aggregate`, `Join`
- Integrated with **Spark MLlib** for machine learning.
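
The `Transform`, `Aggregate`, and `Join` operations can be illustrated without a cluster. The following is a plain-Python sketch of the logic, not the repository's Spark job: it computes a volume-weighted average price (VWAP) per symbol in one-minute tumbling windows and joins each result with a static sector lookup. The window size, field names, and the `sectors` mapping are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

WINDOW_MS = 60_000  # assumed one-minute tumbling window


def window_start(timestamp_ms: int, window_ms: int = WINDOW_MS) -> int:
    """Round an event time down to the start of its window."""
    return timestamp_ms - (timestamp_ms % window_ms)


def vwap_by_window(trades: Iterable[dict], sectors: Dict[str, str]) -> List[dict]:
    # Transform + Aggregate: accumulate notional and volume per (window, symbol).
    totals = defaultdict(lambda: [0.0, 0])
    for t in trades:
        key = (window_start(t["timestamp_ms"]), t["symbol"])
        totals[key][0] += t["price"] * t["quantity"]
        totals[key][1] += t["quantity"]

    results = []
    for (start, symbol), (notional, volume) in sorted(totals.items()):
        if volume == 0:
            continue  # avoid dividing by zero for empty windows
        results.append({
            "window_start_ms": start,
            "symbol": symbol,
            # Join: enrich with reference data from the static lookup.
            "sector": sectors.get(symbol, "UNKNOWN"),
            "vwap": notional / volume,
            "volume": volume,
        })
    return results
```

In a real deployment the same steps would be expressed with Spark's windowing and join operators, so that the work is distributed across executors.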
#### Batch Processing
- **Apache Spark** handles batch data processing.
- Operations: `Transform`, `Aggregate`, `Join`
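
Batch jobs apply the same kinds of operations to data that has already been stored. As an illustration only, the sketch below builds daily open/high/low/close (OHLC) bars from a complete set of trades. The function name, field names, and the use of UTC days are assumptions; the repository does not describe its batch jobs in this detail.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple


def daily_ohlc(trades: Iterable[dict]) -> Dict[Tuple[str, str], dict]:
    """Aggregate trades into one OHLC bar per (symbol, UTC day)."""
    bars: Dict[Tuple[str, str], dict] = {}
    # Sort by event time so "open" and "close" are the first and last trades.
    for t in sorted(trades, key=lambda t: t["timestamp_ms"]):
        day = datetime.fromtimestamp(
            t["timestamp_ms"] / 1000, tz=timezone.utc
        ).date().isoformat()
        key = (t["symbol"], day)
        price = t["price"]
        bar = bars.get(key)
        if bar is None:
            bars[key] = {"open": price, "high": price, "low": price,
                         "close": price, "volume": t["quantity"]}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price
            bar["volume"] += t["quantity"]
    return bars
```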
### 4. Data Storage
- **Cassandra Data Lake:** Stores processed data, offering high availability and scalability.
- **Hadoop HDFS:** Provides additional distributed storage for large datasets.
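
The README does not define a storage schema. One possible layout, sketched below under stated assumptions, partitions Cassandra rows by symbol and day and mirrors the same partitioning in HDFS directory names. The keyspace, table, column names, and the `/data/trades` root are all hypothetical.

```python
import posixpath
from datetime import datetime, timezone
from typing import Tuple

# Hypothetical CQL table: one partition per (symbol, day), newest trades first.
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS market.trades_by_symbol_day (
    symbol text,
    day date,
    ts timestamp,
    price double,
    quantity int,
    PRIMARY KEY ((symbol, day), ts)
) WITH CLUSTERING ORDER BY (ts DESC);
"""


def storage_targets(trade: dict, hdfs_root: str = "/data/trades") -> Tuple[tuple, str]:
    """Return the Cassandra row values and the HDFS directory for one trade."""
    day = datetime.fromtimestamp(
        trade["timestamp_ms"] / 1000, tz=timezone.utc
    ).date().isoformat()
    row = (trade["symbol"], day, trade["timestamp_ms"],
           trade["price"], trade["quantity"])
    # Directory names follow the common key=value partitioning convention.
    path = posixpath.join(hdfs_root, f"date={day}", f"symbol={trade['symbol']}")
    return row, path
```

Bucketing by day keeps Cassandra partitions bounded in size, and the matching directory layout lets batch jobs read a single day without scanning the whole dataset. Note that with `ts` as the only clustering column, two trades for the same symbol at the same millisecond would overwrite each other, so a real schema would likely add a unique trade identifier.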
### 5. Data Push and API
- **WebSocket/HTTP Streaming:** Directly pushes real-time data to web and mobile applications.
- **REST API:** Enables interaction with processed data. The API is served by `Spring Boot`; front-end clients use `Bootstrap.js`, `Chart.js`, `jQuery.js`, and `SockJS.js`.
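
The push path is a fan-out: each processed update is delivered to every connected client. The sketch below models that pattern in memory with `asyncio`, where each queue stands in for one WebSocket session. It is not the repository's Spring Boot implementation; `PushHub` and its methods are hypothetical names.

```python
import asyncio
import json
from typing import Set


class PushHub:
    """In-memory fan-out; each subscriber queue stands in for one client session."""

    def __init__(self, max_pending: int = 100) -> None:
        self._subscribers: Set[asyncio.Queue] = set()
        self._max_pending = max_pending

    def subscribe(self) -> asyncio.Queue:
        """Register a new client and return the queue it should read from."""
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._max_pending)
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, update: dict) -> int:
        """Send one update to all clients; return how many accepted it."""
        message = json.dumps(update)
        delivered = 0
        for queue in list(self._subscribers):
            try:
                queue.put_nowait(message)
                delivered += 1
            except asyncio.QueueFull:
                # Slow client: drop this update instead of blocking the others.
                pass
        return delivered
```

Dropping updates for slow clients is one design choice among several. For price tickers it is often acceptable because a newer value supersedes the missed one, but a trade blotter might need guaranteed delivery instead.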
## Front-End
- Served by **Spring Boot**, with the UI built on **Bootstrap.js**, **Chart.js**, **jQuery.js**, and **SockJS.js**.
- Supports both web and mobile applications with responsive interfaces.
## Key Features
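- **Scalability:** Kafka, Spark, Cassandra, and Hadoop HDFS all scale horizontally, so capacity for streaming and batch data grows by adding nodes.
- **Real-Time Processing:** Spark Streaming processes events as they arrive, and WebSocket/HTTP Streaming pushes the results to applications without polling.
- **High Availability:** Kafka, Cassandra, and Hadoop HDFS replicate data across nodes, so the system can tolerate the failure of individual nodes when replication is configured.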
## Use Cases
- Ideal for financial markets, stock trading platforms, and large-scale data analytics.
## Getting Started
### Prerequisites
- **Apache Kafka**
- **Apache Spark**
- **Cassandra**
- **Hadoop HDFS**
- **Spring Boot** for REST APIs and web services
### Installation
1. Set up Kafka for streaming data ingestion.
2. Configure Spark for both streaming and batch processing.
3. Install and configure Cassandra and Hadoop HDFS for data storage.
4. Set up the REST API and front-end components using Spring Boot and related web technologies.
### Running the System
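- Start Kafka for data ingestion.
- Start Cassandra and Hadoop HDFS so that processed data can be stored.
- Submit the Spark Streaming and batch jobs.
- Start the Spring Boot application, which pushes data to the front-end using WebSocket/HTTP Streaming.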
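
Before submitting jobs, it can help to confirm that each service is accepting connections. The sketch below is a simple TCP readiness check. The hosts and ports are assumptions based on common defaults (Kafka 9092, Cassandra 9042, HDFS NameNode 8020, Spring Boot 8080) and should be adjusted to the actual deployment.

```python
import socket
from typing import Dict, Tuple

# Assumed default ports on localhost; not taken from the repository.
DEFAULT_SERVICES: Dict[str, Tuple[str, int]] = {
    "kafka": ("localhost", 9092),
    "cassandra": ("localhost", 9042),
    "hdfs-namenode": ("localhost", 8020),
    "spring-boot": ("localhost", 8080),
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services: Dict[str, Tuple[str, int]] = DEFAULT_SERVICES) -> Dict[str, bool]:
    """Check every service and report which ones accept connections."""
    return {name: is_listening(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'up' if up else 'down'}")
```

An open port only shows that the process is listening; it does not prove that the service is healthy or that topics, keyspaces, or directories exist.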
## License
This project is licensed under the MIT License.