https://github.com/joshuapowell/preparing-your-mainframe-data-for-machine-learning
Mainframe Data Wrangling: Preparing Your Mainframe Data for Machine Learning
- Host: GitHub
- URL: https://github.com/joshuapowell/preparing-your-mainframe-data-for-machine-learning
- Owner: joshuapowell
- License: apache-2.0
- Created: 2024-11-14T13:58:47.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-21T13:40:38.000Z (about 1 year ago)
- Last Synced: 2025-09-05T07:28:27.848Z (9 months ago)
- Topics: api-usability, data-engineering, data-wrangling, large-scale-computing, machine-learning, mainframe
- Language: TeX
- Homepage:
- Size: 178 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
# Mainframe Data Wrangling: Preparing Your Data for Use in Machine Learning Models
Mainframe computing continues to drive the global economy, with
forty-five of the world's top fifty banks \[1\] handling critical
transaction data through the IBM Z mainframe platform. While recent
research highlights the importance of mainframe modernization rather
than replacement \[2\], enterprises struggle to effectively utilize
mainframe data for automation and optimization due to data-driven and
communication-driven failures \[3\]. This challenge creates a
significant gap between available mainframe capabilities and realized
business value \[4\].
To address this gap, we conducted semi-structured interviews with
eighteen participants across three roles: mainframe subject matter
experts (SMEs) \[n=6\], mainframe individual contributor end-users
\[n=9\], and mainframe people manager end-users \[n=3\]. The study,
conducted between \[dates\], investigated two research questions:
\[RQ1\] What are the primary use cases in which mainframe network and
network security data affect mean time to resolution (MTTR) in top
fifty banks? \[RQ2\] What mainframe data sources and methods do
end-users employ to resolve these network and network security issues?
Analysis revealed that 91% of participants \[5\] encountered data
quality or completeness issues that impeded network problem resolution.
This tutorial demonstrates how end-users can leverage exploratory data
analysis techniques \[6\], mainframe data APIs \[7\], and open-source
data science tools \[8\] to prepare data for advanced analytics and
machine learning applications. The presented methodology aims to reduce
MTTR by addressing identified data-driven and communication-driven
failure points \[9\], with specific focus on network security use cases
\[10-13\].
**Keywords:** Data Engineering, Data Wrangling, API Usability, API
Onboarding, Mainframe, Large-scale Computing, Machine Learning
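
As a rough illustration of the exploratory data analysis step described above, the sketch below profiles a small incident export for completeness and computes MTTR over the resolved incidents. It is a minimal example under stated assumptions, not the tutorial's actual method: the column names (`incident_id`, `opened_at`, `resolved_at`, `source_ip`, `port`), the sample rows, and the set of missing-value markers are all hypothetical, and the code uses only the Python standard library.

```python
import csv
import io
from datetime import datetime
from statistics import mean

# Hypothetical sample export. The column names and values are illustrative
# assumptions, not a real mainframe record layout.
SAMPLE_CSV = """incident_id,opened_at,resolved_at,source_ip,port
INC001,2024-03-01T08:00:00,2024-03-01T10:00:00,10.0.0.5,443
INC002,2024-03-01T09:30:00,2024-03-01T15:30:00,,22
INC003,2024-03-02T11:00:00,,10.0.0.9,
INC004,2024-03-03T07:15:00,2024-03-03T08:15:00,10.0.0.7,8080
"""

# Markers treated as "missing" (an assumption; adjust to the real export).
MISSING = {"", "NA", "N/A", "NULL"}


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def is_missing(value):
    """Return True when a field is absent or holds a missing-value marker."""
    return value is None or value.strip().upper() in MISSING


def completeness_report(rows, fields):
    """Return the fraction of non-missing values for each requested field."""
    total = len(rows)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(not is_missing(row.get(field)) for row in rows) / total
        for field in fields
    }


def mean_time_to_resolution_hours(rows):
    """Return mean hours from open to resolve, skipping unresolved incidents."""
    durations = []
    for row in rows:
        if is_missing(row.get("opened_at")) or is_missing(row.get("resolved_at")):
            continue
        opened = datetime.fromisoformat(row["opened_at"])
        resolved = datetime.fromisoformat(row["resolved_at"])
        durations.append((resolved - opened).total_seconds() / 3600)
    return mean(durations) if durations else None


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    print(completeness_report(rows, ["source_ip", "port", "resolved_at"]))
    print(mean_time_to_resolution_hours(rows))
```

Reporting completeness per field before computing MTTR makes it visible how many incidents were excluded from the average, which matters when unresolved or partially logged incidents are common.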
## References
1. IBM (International Business Machines Corporation). (2023). IBM 2023 Annual Report.
2. Wishart-Smith, H. (2024, November 13). Mainframes: the backbone of the
worldwide economy. Forbes.
3. Ryseff, J., de Bruhl, B., & Newberry, S. J. (2024). The Root Causes of
Failure for Artificial Intelligence Projects and How They Can Succeed:
Avoiding the Anti-Patterns of AI.
4. IBM z/OS operating system. (Accessed: October 28, 2024).
https://www.ibm.com/products/zos
5. McGregor, S. E. (2022). Practical Python Data Wrangling and Data
Quality. http://oreilly.com
6. Powell, J. I. (2024, April). Internal study. Broadcom Mainframe
Software Division.
7. Alam, A., Bales, R., Dumir, V., Kunze, N., Li, J., Mishra, S., Rivera,
E., Wan, M., & Yu, Y. (2024). Turning Data into Insight with Machine
Learning for IBM z/OS (1st ed.). International Business Machines
Corporation.
8. Broadcom Mainframe Developer Portal. (Accessed: October 28, 2024).
https://integration.mainframe.broadcom.com/
9. Harrell, M. (2024). Mainframe Application Developer Study.
10. Kanvar, V., Tamilselvam, S., & Raghunath, K. N. (2024, August 8).
Enabling communication via APIs for mainframe applications. arXiv.org.
https://arxiv.org/abs/2408.04230
11. Dau, A. T. V., Dao, H. T., Nguyen, A. T., Tran, H. T., Nguyen, P. X., &
Bui, N. D. Q. (2024, August 5). XMainframe: a large language model for
mainframe modernization. arXiv.org. https://arxiv.org/abs/2408.04660
12. Raju, J. (2024). Modernizing Mainframe Workloads in Banking: Embracing
the Power of Hyperscalers. International Journal of Computer Engineering
and Technology (IJCET), 15(5), 366-374.
13. Raju, J. (2024). AI-Driven Transformation of Mainframe Environments: A
Comprehensive Framework for Operational Resilience. International
Journal of Engineering and Technology Research (IJETR), 9(2), 420-433.