https://github.com/angryelizar/dataprocessor

Spring Boot apps for XML to JSON conversion and JSON file processing with RabbitMQ.

# Data Processor

## Overview

Data Processor consists of two Spring Boot applications designed for data processing and management:

1. **Xml2Json App**: Converts XML messages to JSON format, saves them to files, and integrates with CloudAMQP for message brokering.
2. **BatchSplitter App**: Processes JSON messages from a broker, splits records into multiple files with a maximum of 100 entries each, and manages file creation based on data type and date.

## Xml2Json App

### Description

The Xml2Json App performs the following tasks:

- Receives XML messages and converts them to JSON format.
- Saves the JSON data into files, including a record count and an array of `Data` objects.
- Uses Swagger for API documentation.
- Uses CloudAMQP (a hosted RabbitMQ service) as the message broker: it creates queues and sends messages to them.
- Stores JSON files in the `data/json` directory.
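The conversion step above can be illustrated with a minimal sketch that uses only the JDK's DOM parser. It is not the project's actual implementation, which may well rely on a dedicated library. The class name `XmlToJsonSketch` and the sample XML are assumptions made for illustration; the XML is simply shaped like the JSON example in this README. The sketch ignores attributes, repeated sibling elements, and control-character escaping.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper: converts element-only XML into a compact JSON string. */
class XmlToJsonSketch {

    /** Parses the XML text and renders it as {"RootName": ...}. */
    static String convert(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return "{" + quote(root.getTagName()) + ":" + render(root) + "}";
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Elements with child elements become objects; leaf elements become strings. */
    static String render(Element element) {
        StringBuilder fields = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip text and whitespace between elements
            }
            if (fields.length() > 0) {
                fields.append(",");
            }
            fields.append(quote(child.getNodeName()))
                    .append(":")
                    .append(render((Element) child));
        }
        if (fields.length() == 0) {
            return quote(element.getTextContent().trim());
        }
        return "{" + fields + "}";
    }

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // Assumed input: the README does not show the XML format.
        String xml = "<Data><Layer>DailyScheduler</Layer><Type>Information</Type></Data>";
        System.out.println(convert(xml));
    }
}
```

Every leaf value is emitted as a JSON string, which matches the quoted `Id` and `Epoch` values in the example below.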

### Technologies Used

- Spring Boot
- CloudAMQP
- Swagger
- RabbitMQ

### File Format

**JSON Example:**
```json
{
  "Data": {
    "Method": {
      "Name": "Order",
      "Type": "Services",
      "Assembly": "ServiceRepository, Version=1.0.0.1, Culture=neutral, PublicKeyToken=null"
    },
    "Process": {
      "Name": "scheduler.exe",
      "Id": "185232",
      "Start": {
        "Epoch": "1464709722277",
        "Date": "2016-05-31T12:07:42.2771759+03:00"
      }
    },
    "Layer": "DailyScheduler",
    "Creation": {
      "Epoch": "1464709728500",
      "Date": "2016-05-31T07:48:21.5007982+03:00"
    },
    "Type": "Information"
  }
}
```

## BatchSplitter App

### Description

The BatchSplitter App is designed to process and manage JSON records received from the Xml2Json App. It performs the following tasks:

- **Reads JSON Files**: Retrieves JSON records from files located in the `data/json` directory, which were generated by the Xml2Json App.
- **Splits Records**: Organizes records into multiple files, each containing a maximum of 100 entries.
- **File Management**: Creates new files in the `data/filteredJson` directory based on data type and date, with a sequential file index to maintain order and ensure proper segmentation.
- **Resumable Processing**: Allows the application to be stopped and restarted, continuing from the last processed file to ensure no data is missed.
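The splitting rule above (at most 100 records per file) amounts to partitioning a list into fixed-size batches. The following is a minimal sketch of that idea, not the project's code; `BatchSplitSketch`, `split`, and `MAX_RECORDS_PER_FILE` are names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: partitions records into batches of bounded size. */
class BatchSplitSketch {

    /** The limit stated in this README. */
    static final int MAX_RECORDS_PER_FILE = 100;

    /** Splits records into consecutive batches of at most maxPerBatch items. */
    static <T> List<List<T>> split(List<T> records, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, records.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 1; i <= 250; i++) {
            records.add(i);
        }
        List<List<Integer>> batches = split(records, MAX_RECORDS_PER_FILE);
        System.out.println(batches.size());        // 3
        System.out.println(batches.get(2).size()); // 50
    }
}
```

With 250 records, the first two batches hold 100 records each and the last one holds the remaining 50.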

### Technologies Used

- Spring Boot
- RabbitMQ

### File Management

**Original File Format:**

- Files are named based on type and date, e.g., `Information-2016-05-31.log`, located in `data/json`.

**Split File Format:**

- Split files are named with an additional index to distinguish multiple files for the same date, e.g., `Information-2016-05-31-0001.log`, `Information-2016-05-31-0002.log`, etc., and are saved in `data/filteredJson`.
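The naming scheme above can be reproduced with a zero-padded format string. This is a sketch under the assumption that the index is always padded to four digits, as in the examples; `SplitFileNameSketch` and its methods are hypothetical names, not part of the project.

```java
import java.time.LocalDate;

/** Hypothetical helper: builds file names following the patterns shown above. */
class SplitFileNameSketch {

    /** Name of an original file, e.g. Information-2016-05-31.log. */
    static String originalName(String type, LocalDate date) {
        // LocalDate.toString() yields the ISO form yyyy-MM-dd.
        return type + "-" + date + ".log";
    }

    /** Name of a split file, e.g. Information-2016-05-31-0001.log. */
    static String splitName(String type, LocalDate date, int index) {
        // %04d pads the index with zeros to a width of four digits.
        return String.format("%s-%s-%04d.log", type, date, index);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2016, 5, 31);
        System.out.println(originalName("Information", date));
        System.out.println(splitName("Information", date, 1));
        System.out.println(splitName("Information", date, 2));
    }
}
```

Zero padding keeps the split files in creation order when a directory listing is sorted by name, at least up to index 9999.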

### Key Features

- **Dynamic File Creation**: Automatically creates new files in `data/filteredJson` based on the type and date of the data.
- **Indexing**: Each split file carries a zero-padded sequential index (`0001`, `0002`, and so on), so files for the same type and date keep their order.
- **Efficient Data Handling**: Each file contains at most 100 records, which keeps individual files small even when the total volume is large.
- **Fault Tolerance**: After a restart, processing resumes from the last processed file, so no data is skipped.
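The README does not say how the resume point is tracked. One possible approach, shown here purely as an assumption, is a checkpoint file that lists the input files already processed; on restart, only files missing from that list are picked up. `ResumeSketch`, `pendingFiles`, `markProcessed`, the `processed.txt` file name, and the `Error` data type are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: remembers processed input files across restarts. */
class ResumeSketch {

    /** Lists .log files in inputDir that the checkpoint does not mention yet. */
    static List<String> pendingFiles(Path inputDir, Path checkpoint) {
        try {
            Set<String> done = new HashSet<>();
            if (Files.exists(checkpoint)) {
                done.addAll(Files.readAllLines(checkpoint));
            }
            try (Stream<Path> files = Files.list(inputDir)) {
                return files.map(path -> path.getFileName().toString())
                        .filter(name -> name.endsWith(".log") && !done.contains(name))
                        .sorted()
                        .collect(Collectors.toList());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends a file name to the checkpoint once it is fully processed. */
    static void markProcessed(Path checkpoint, String fileName) {
        try {
            Files.writeString(checkpoint, fileName + System.lineSeparator(),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempDirectory("json");
        Files.createFile(input.resolve("Error-2016-05-31.log"));
        Files.createFile(input.resolve("Information-2016-05-31.log"));
        Path checkpoint = input.resolve("processed.txt");

        markProcessed(checkpoint, "Error-2016-05-31.log");
        // Simulated restart: only the unprocessed file is still pending.
        System.out.println(pendingFiles(input, checkpoint));
    }
}
```

Writing the checkpoint entry only after a file has been fully split means a crash in the middle of a file causes that file to be reprocessed rather than skipped.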

### How to Use

1. **Configuration**: Ensure that the application properties are correctly set up to match the file paths and RabbitMQ settings.
2. **Run the Application**: Start the BatchSplitter App using Maven:
   ```bash
   mvn spring-boot:run
   ```
3. **Monitor and Verify**: Check the output files in the `data/filteredJson` directory to ensure that records are properly split and indexed.

### Example

**Input Directory:**

- `data/json` contains files like `Information-2016-05-31.log`.

**Output Directory:**

- `data/filteredJson` will contain split files such as `Information-2016-05-31-0001.log` (records 1 to 100) and `Information-2016-05-31-0002.log` (records 101 to 200).

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgments

- Spring Boot for simplifying application development.
- RabbitMQ for providing reliable message brokering.