https://github.com/angryelizar/dataprocessor
Spring Boot apps for XML to JSON conversion and JSON file processing with RabbitMQ.
java rabbitmq rest-api spring-boot
Last synced: 8 days ago
- Host: GitHub
- URL: https://github.com/angryelizar/dataprocessor
- Owner: angryelizar
- Created: 2024-09-08T14:06:17.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-09-11T22:38:10.000Z (2 months ago)
- Last Synced: 2024-09-13T04:30:38.145Z (2 months ago)
- Topics: java, rabbitmq, rest-api, spring-boot
- Language: Java
- Homepage:
- Size: 40 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Data Processor
## Overview
Data Processor consists of two Spring Boot applications designed for data processing and management:
1. **Xml2Json App**: Converts XML messages to JSON format, saves them to files, and integrates with CloudAMQP for message brokering.
2. **BatchSplitter App**: Processes JSON messages from a broker, splits records into multiple files with a maximum of 100 entries each, and manages file creation based on data type and date.

## Xml2Json App
### Description
The Xml2Json App performs the following tasks:
- Receives XML messages and converts them to JSON format.
- Saves the JSON data into files, including a record count and an array of `Data` objects.
- Uses Swagger for API documentation.
- Integrates with CloudAMQP for message brokering, creating queues, and sending messages.
- Stores JSON files in the `data/json` directory.

### Technologies Used
- Spring Boot
- CloudAMQP
- Swagger
- RabbitMQ

### File Format
**JSON Example:**
```json
{
  "Data": {
    "Method": {
      "Name": "Order",
      "Type": "Services",
      "Assembly": "ServiceRepository, Version=1.0.0.1, Culture=neutral, PublicKeyToken=null"
    },
    "Process": {
      "Name": "scheduler.exe",
      "Id": "185232",
      "Start": {
        "Epoch": "1464709722277",
        "Date": "2016-05-31T12:07:42.2771759+03:00"
      }
    },
    "Layer": "DailyScheduler",
    "Creation": {
      "Epoch": "1464709728500",
      "Date": "2016-05-31T07:48:21.5007982+03:00"
    },
    "Type": "Information"
  }
}
```
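The XML-to-JSON step can be illustrated with a minimal sketch. It assumes the incoming XML mirrors the JSON above, with one element per field and no attributes or repeated sibling names; the actual input schema is not shown in this README. The class `XmlToJsonSketch` and its methods are hypothetical and use only the JDK's built-in DOM parser, so this is not necessarily how the Xml2Json App performs the conversion.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: class and method names are illustrative, not taken from the project.
final class XmlToJsonSketch {
    private XmlToJsonSketch() {}

    // Converts an XML document into a JSON object keyed by the root element name.
    static String convert(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations to avoid external-entity (XXE) processing.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return "{" + quote(root.getTagName()) + ":" + toJson(root) + "}";
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not convert XML to JSON", e);
        }
    }

    // An element with child elements becomes a JSON object; a leaf becomes a string.
    private static String toJson(Element element) {
        StringBuilder fields = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                if (fields.length() > 0) {
                    fields.append(',');
                }
                fields.append(quote(child.getNodeName()))
                      .append(':')
                      .append(toJson((Element) child));
            }
        }
        if (fields.length() == 0) {
            return quote(element.getTextContent().trim());
        }
        return "{" + fields + "}";
    }

    // Wraps a value in double quotes and escapes characters JSON does not allow raw.
    private static String quote(String value) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        String xml = "<Data><Layer>DailyScheduler</Layer><Type>Information</Type></Data>";
        // Prints {"Data":{"Layer":"DailyScheduler","Type":"Information"}}
        System.out.println(convert(xml));
    }
}
```

Every leaf value is emitted as a JSON string, which matches the example above, where `Id` and `Epoch` are quoted. Repeated sibling elements would produce duplicate keys in this sketch, so a real converter would need to map them to arrays.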
## BatchSplitter App

### Description
The BatchSplitter App is designed to process and manage JSON records received from the Xml2Json App. It performs the following tasks:
- **Reads JSON Files**: Retrieves JSON records from files located in the `data/json` directory, which were generated by the Xml2Json App.
- **Splits Records**: Organizes records into multiple files, each containing a maximum of 100 entries.
- **File Management**: Creates new files in the `data/filteredJson` directory based on data type and date, with a sequential file index to maintain order and ensure proper segmentation.
- **Resumable Processing**: Allows the application to be stopped and restarted, continuing from the last processed file to ensure no data is missed.

### Technologies Used
- Spring Boot
- RabbitMQ

### File Management
**Original File Format:**
- Files are named based on type and date, e.g., `Information-2016-05-31.log`, located in `data/json`.
**Split File Format:**
- Split files are named with an additional index to distinguish multiple files for the same date, e.g., `Information-2016-05-31-0001.log`, `Information-2016-05-31-0002.log`, etc., and are saved in `data/filteredJson`.
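The naming scheme above can be sketched as a small helper. The class `SplitFileNames` and its methods are hypothetical, and taking the date from the record's `Creation.Date` field is an assumption; the README only says that names are based on type and date.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.util.Locale;

// Hypothetical helper: names are illustrative, not taken from the project.
final class SplitFileNames {
    private SplitFileNames() {}

    // Name of the unsplit file, e.g. Information-2016-05-31.log
    static String originalName(String type, LocalDate date) {
        return type + "-" + date + ".log";
    }

    // Name of a split file with a 1-based, zero-padded, four-digit index.
    static String splitName(String type, LocalDate date, int index) {
        return String.format(Locale.ROOT, "%s-%s-%04d.log", type, date, index);
    }

    public static void main(String[] args) {
        // Assumption: the date part comes from the record's Creation.Date value.
        LocalDate date = OffsetDateTime.parse("2016-05-31T07:48:21.5007982+03:00").toLocalDate();
        System.out.println(originalName("Information", date)); // Information-2016-05-31.log
        System.out.println(splitName("Information", date, 2)); // Information-2016-05-31-0002.log
    }
}
```

Zero-padding the index keeps the files in sequence when a directory listing is sorted by name.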
### Key Features
- **Dynamic File Creation**: Automatically creates new files in `data/filteredJson` based on the type and date of the data.
- **Indexing**: Each split file carries a sequential index in its name (`0001`, `0002`, and so on), so files for the same type and date stay in order.
- **Efficient Data Handling**: Ensures that each file contains a maximum of 100 records, aiding in the effective management of large volumes of data.
- **Fault Tolerance**: Capable of resuming from the last processed file, ensuring robustness and reliability in data processing.

### How to Use
1. **Configuration**: Ensure that the application properties are correctly set up to match the file paths and RabbitMQ settings.
2. **Run the Application**: Start the BatchSplitter App using Maven:
   ```bash
   mvn spring-boot:run
   ```
3. **Monitor and Verify**: Check the output files in the `data/filteredJson` directory to ensure that records are properly split and indexed.

### Example
**Input Directory:**
- `data/json` contains files like `Information-2016-05-31.log`.
**Output Directory:**
- `data/filteredJson` will contain split files such as `Information-2016-05-31-0001.log` (records 1 to 100) and `Information-2016-05-31-0002.log` (records 101 to 200).
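
The splitting shown in this example can be sketched as a generic partitioning step. The class `BatchSplitSketch` and its `split` method are hypothetical; the sketch covers only how records are grouped into batches of at most 100, not how the BatchSplitter App reads or writes files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative, not taken from the project.
final class BatchSplitSketch {
    // Maximum number of records per output file, as stated in this README.
    static final int MAX_RECORDS_PER_FILE = 100;

    private BatchSplitSketch() {}

    // Splits records into consecutive batches of at most batchSize, preserving order.
    static <T> List<List<T>> split(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 1; i <= 250; i++) {
            records.add(i);
        }
        List<List<Integer>> batches = split(records, MAX_RECORDS_PER_FILE);
        for (int i = 0; i < batches.size(); i++) {
            // Prints one line per batch: sizes 100, 100 and 50 for 250 records.
            System.out.println("batch " + (i + 1) + ": " + batches.get(i).size() + " records");
        }
    }
}
```

With 250 input records this yields three batches, which would correspond to files with indexes `0001`, `0002` and `0003`.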
### License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
### Acknowledgments
- Spring Boot for simplifying application development.
- RabbitMQ for providing reliable message brokering.