{"id":19763221,"url":"https://github.com/angryelizar/dataprocessor","last_synced_at":"2026-05-13T12:37:48.941Z","repository":{"id":256032702,"uuid":"854149751","full_name":"angryelizar/dataProcessor","owner":"angryelizar","description":"Spring Boot apps for XML to JSON conversion and JSON file processing with RabbitMQ.","archived":false,"fork":false,"pushed_at":"2024-09-11T22:38:10.000Z","size":41,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-28T10:17:15.867Z","etag":null,"topics":["java","rabbitmq","rest-api","spring-boot"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/angryelizar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-08T14:06:17.000Z","updated_at":"2024-09-11T22:39:37.000Z","dependencies_parsed_at":"2024-09-12T00:17:46.230Z","dependency_job_id":null,"html_url":"https://github.com/angryelizar/dataProcessor","commit_stats":null,"previous_names":["angryelizar/dataprocessor"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/angryelizar/dataProcessor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angryelizar%2FdataProcessor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angryelizar%2FdataProcessor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angryelizar%2FdataProcessor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angryeliza
r%2FdataProcessor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/angryelizar","download_url":"https://codeload.github.com/angryelizar/dataProcessor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angryelizar%2FdataProcessor/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259519788,"owners_count":22870368,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["java","rabbitmq","rest-api","spring-boot"],"created_at":"2024-11-12T04:08:31.130Z","updated_at":"2025-10-27T10:06:49.983Z","avatar_url":"https://github.com/angryelizar.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Processor\n\n## Overview\n\nData Processor consists of two Spring Boot applications designed for data processing and management:\n\n1. **Xml2Json App**: Converts XML messages to JSON format, saves them to files, and integrates with CloudAMQP for message brokering.\n2. 
**BatchSplitter App**: Processes JSON messages from a broker, splits records into multiple files with a maximum of 100 entries each, and manages file creation based on data type and date.\n\n## Xml2Json App\n\n### Description\n\nThe Xml2Json App performs the following tasks:\n\n- Receives XML messages and converts them to JSON format.\n- Saves the JSON data into files, including a record count and an array of `Data` objects.\n- Uses Swagger for API documentation.\n- Integrates with CloudAMQP for message brokering, creating queues, and sending messages.\n- Stores JSON files in the `data/json` directory.\n\n### Technologies Used\n\n- Spring Boot\n- CloudAMQP\n- Swagger\n- RabbitMQ\n\n### File Format\n\n**JSON Example:**\n```json\n{\n  \"Data\": {\n    \"Method\": {\n      \"Name\": \"Order\",\n      \"Type\": \"Services\",\n      \"Assembly\": \"ServiceRepository, Version=1.0.0.1, Culture=neutral, PublicKeyToken=null\"\n    },\n    \"Process\": {\n      \"Name\": \"scheduler.exe\",\n      \"Id\": \"185232\",\n      \"Start\": {\n        \"Epoch\": \"1464709722277\",\n        \"Date\": \"2016-05-31T12:07:42.2771759+03:00\"\n      }\n    },\n    \"Layer\": \"DailyScheduler\",\n    \"Creation\": {\n      \"Epoch\": \"1464709728500\",\n      \"Date\": \"2016-05-31T07:48:21.5007982+03:00\"\n    },\n    \"Type\": \"Information\"\n  }\n}\n```\n## BatchSplitter App\n\n### Description\n\nThe BatchSplitter App is designed to process and manage JSON records received from the Xml2Json App. 
It performs the following tasks:\n\n- **Reads JSON Files**: Retrieves JSON records from files located in the `data/json` directory, which were generated by the Xml2Json App.\n- **Splits Records**: Organizes records into multiple files, each containing a maximum of 100 entries.\n- **File Management**: Creates new files in the `data/filteredJson` directory based on data type and date, with a sequential file index to maintain order and ensure proper segmentation.\n- **Resumable Processing**: Allows the application to be stopped and restarted, continuing from the last processed file to ensure no data is missed.\n\n### Technologies Used\n\n- Spring Boot\n- RabbitMQ\n\n### File Management\n\n**Original File Format:**\n\n- Files are named based on type and date, e.g., `Information-2016-05-31.log`, located in `data/json`.\n\n**Split File Format:**\n\n- Split files are named with an additional index to distinguish multiple files for the same date, e.g., `Information-2016-05-31-0001.log`, `Information-2016-05-31-0002.log`, etc., and are saved in `data/filteredJson`.\n\n### Key Features\n\n- **Dynamic File Creation**: Automatically creates new files in `data/filteredJson` based on the type and date of the data.\n- **Indexing**: Each file is indexed to maintain a sequence, crucial for orderly data management.\n- **Efficient Data Handling**: Ensures that each file contains a maximum of 100 records, aiding in the effective management of large volumes of data.\n- **Fault Tolerance**: Capable of resuming from the last processed file, ensuring robustness and reliability in data processing.\n\n### How to Use\n\n1. **Configuration**: Ensure that the application properties are correctly set up to match the file paths and RabbitMQ settings.\n2. **Run the Application**: Start the BatchSplitter App using Maven:\n   ```bash\n   mvn spring-boot:run\n   ```\n3. 
**Monitor and Verify**: Check the output files in the `data/filteredJson` directory to ensure that records are properly split and indexed.\n\n### Example\n\n**Input Directory:**\n\n- `data/json` contains files like `Information-2016-05-31.log`.\n\n**Output Directory:**\n\n- `data/filteredJson` will contain split files such as `Information-2016-05-31-0001.log` (Contains records 1 to 100) and `Information-2016-05-31-0002.log` (Contains records 101 to 200).\n\n### License\n\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.\n\n### Acknowledgments\n\n- Spring Boot for simplifying application development.\n- RabbitMQ for providing reliable message brokering.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fangryelizar%2Fdataprocessor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fangryelizar%2Fdataprocessor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fangryelizar%2Fdataprocessor/lists"}