{"id":24383621,"url":"https://github.com/bsantanna/iban-validator-model","last_synced_at":"2025-04-11T01:12:25.843Z","repository":{"id":58862745,"uuid":"521639420","full_name":"bsantanna/iban-validator-model","owner":"bsantanna","description":"A Machine Learning Artificial Neural Network Model for validating IBAN account numbers.","archived":false,"fork":false,"pushed_at":"2023-07-16T20:44:47.000Z","size":1114,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-11T01:12:20.095Z","etag":null,"topics":["banking","deep-learning","docker","grpc","iban","iban-validator","java","keras","keras-tensorflow","machine-learning","neural-network","python","rest-api","service","tabular-data","tensorflow","validation"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bsantanna.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-08-05T13:00:00.000Z","updated_at":"2024-12-29T11:46:07.000Z","dependencies_parsed_at":"2023-12-18T15:13:19.815Z","dependency_job_id":"1642733f-e118-4160-994b-7cebfe992565","html_url":"https://github.com/bsantanna/iban-validator-model","commit_stats":{"total_commits":9,"total_committers":2,"mean_commits":4.5,"dds":"0.11111111111111116","last_synced_commit":"0b16f1dd393f77d982ae9d98770dcc20a310a15b"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bsantanna%2Fiban-validator-model","tags_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/bsantanna%2Fiban-validator-model/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bsantanna%2Fiban-validator-model/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bsantanna%2Fiban-validator-model/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bsantanna","download_url":"https://codeload.github.com/bsantanna/iban-validator-model/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248322571,"owners_count":21084337,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["banking","deep-learning","docker","grpc","iban","iban-validator","java","keras","keras-tensorflow","machine-learning","neural-network","python","rest-api","service","tabular-data","tensorflow","validation"],"created_at":"2025-01-19T10:15:13.484Z","updated_at":"2025-04-11T01:12:25.824Z","avatar_url":"https://github.com/bsantanna.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# IBAN Validator Model\n\n\u003e ML model for validating IBAN account numbers.\n\nThis is an open-source project which delivers a machine learning model for validating [IBAN](https://www.iso13616.org)\naccount numbers accessible via [gRPC](#grpc-service) and [REST](#rest-api) APIs.\n\nThe [trained model](#trained-model) is distributed in two forms:\n\n - As a [Notebook at Kaggle](https://www.kaggle.com/code/brunosantanna/iban-validator-model/notebook) for interactive experimentation.\n - As a [Docker 
Image](https://hub.docker.com/r/bsantanna/iban-validator-model) which contains all\nruntime [dependencies](#dependencies) and is now used as a base example for cloud-related projects.\n\nThe project source code is available in the [bsantanna/iban-validator-model](https://github.com/bsantanna/iban-validator-model) GitHub repository.\n\n\n\n## Table of contents\n\n* [Container Image](#container-image)\n* [Usage example](#usage-example)\n* [Project description](#project-description)\n  * [The challenge](#the-challenge)\n  * [The proposed solution](#the-proposed-solution)\n  * [Dependencies](#dependencies)\n  * [Trained Model](#trained-model)\n  * [Module organization](#module-organization)\n    * [Training](#training)\n    * [Prediction](#prediction)\n      * [gRPC Service](#grpc-service)\n      * [REST API](#rest-api)\n* [Conclusion and references](#conclusion-and-references)\n* [Changelog](#changelog)\n* [License](#license)\n\n\n\n## Container Image\n\nOne deliverable of this project is\na [Docker Image](https://hub.docker.com/r/bsantanna/iban-validator-model) which can be used under\nthis [project license](#license).\n\n\u003e Disclaimer:\n\u003e\n\u003e - The dataset may be outdated in comparison to the latest IBAN registry / country information.\n\u003e - This model was created for case study purposes and may predict incorrect results.\n\u003e - Use the model at your own discretion.\n\nTo run this image with Docker, use the following command:\n\n```bash\n$ docker run -it --rm -p 41151:41151 -p 8080:8080 bsantanna/iban-validator-model\n```\n\nAfter a few moments the container should spawn two processes:\n\n- A Python [gRPC](#grpc-service) service listening on port 41151.\n- A Spring Boot / Java [REST API](#rest-api) listening on port 8080.\n\n\n\n\u003e Apple Silicon users:\n\u003e \n\u003e - To run the Docker image, use this alternative tag: `bsantanna/iban-validator-model:aarch64`\n\u003e - To run the notebooks of this repository, consider [following these 
instructions](https://developer.apple.com/metal/tensorflow-plugin/).\n\u003e\n\n\n\n## Usage example\n\n![IBAN validation use case](doc/assets/validation_use_case.png)\n\nThe example below uses a sample IBAN from [https://bank.codes](https://bank.codes/iban/examples/).\n\nRequest the validation data for an IBAN as JSON:\n\n```bash\n$ curl -s \\\n  \"http://localhost:8080/validation?iban=BR1800000000141455123924100C2\"\n```\n\nThis returns the following output:\n\n```json\n{\n  \"classification\": {\n    \"bban_regex\": \"[0-9]{23}[A-Z]{1}[A-Za-z0-9]{1}\",\n    \"check_digit_regex\": \"[0-9][0-9]\",\n    \"code\": \"BR\",\n    \"country\": \"Brazil\",\n    \"size\": 29\n  },\n  \"description\": \"IBAN passed validation\",\n  \"iban\": \"BR1800000000141455123924100C2\",\n  \"is_valid\": true\n}\n```\n\n\n## Project description\n\nThe main motivation for creating this project was study and self-development in the subjects\nof [Deep Learning](https://en.wikipedia.org/wiki/Deep_learning)\nand [Artificial Neural Networks](https://en.wikipedia.org/wiki/Artificial_neural_network)\nusing [TensorFlow](https://www.tensorflow.org), a popular Machine Learning Platform.\n\nIn 2022, several Machine Learning SaaS and PaaS\nofferings were available in the market, in a field of computer science that had just become a mainstream topic. 
It just made\nsense to me to pick one major framework and start practicing with a well-known problem such as IBAN validation.\n\nAs a software engineer, I found answers to some of my practical questions with this project and was rewarded with\nproficiency in modeling neural networks and distributing them for prediction at scale using cloud containers.\n\n### The challenge\n\nLooking for an idea for a hands-on/short-lived project to practice and learn Neural Network modeling\nwith [Keras](https://keras.io/api/) and [TensorFlow](https://www.tensorflow.org/), I came across the idea of creating\nthis [IBAN](https://www.iso13616.org) validator, as it is a simple use case with good references on the internet.\n\n### The proposed solution\n\nTo employ a simple yet efficient Machine Learning Model, the proposed solution\naddresses the challenge using the following approach.\n\n\u003e 1. The Machine Learning Model should memorize [a table](https://github.com/bsantanna/iban-validator-model/blob/main/modules/training/data/country_validation_json.csv) with\n     country-specific [Regular Expression](https://en.wikipedia.org/wiki/Regular_expression) rules formatted as static JSON\n     document strings.\n\u003e 2. Predict the correct JSON when other items from the table are given as features, extracted from the\n     input [IBAN](https://www.iso13616.org): the 2-letter [ISO 3166-1](https://en.wikipedia.org/wiki/ISO_3166-1) country\n     code and the length / size.\n\u003e 3. 
Parse the predicted JSON and apply its regular expressions, using the host language's regex engine, to validate the\n     input [IBAN](https://www.iso13616.org).\n\n### Dependencies\n\nProject development environment and dependencies:\n\n- [Jupyter](https://jupyter.org): Used as the interactive development environment.\n- [Matplotlib](https://matplotlib.org): Used for plotting the loss function over epochs to measure training\n  results.\n- [Pandas](https://pandas.pydata.org/pandas-docs/stable/): Used as the main tabular data manipulation framework.\n- [TensorFlow](https://www.tensorflow.org/api_docs/python/tf): Used as the Machine Learning Platform and one of the main\n  topics of interest of this project.\n- [Keras](https://keras.io/api/): Used as the Deep Learning and Neural Network fluent design API and another of the main\n  topics of interest of this project.\n- [gRPC](https://www.grpc.io): Used as the Remote Procedure Call framework for system integration.\n- [Spring Boot Web](https://spring.io/projects/spring-boot): Used as Dependency Injection Container, REST Framework API\n  and System Integration proof-of-concept.\n\n### Trained Model\n\nThe trained model was constructed using the [Keras Functional API](https://keras.io/guides/functional_api/).\n\n![JSON Classification model](doc/assets/json_classification_model.png)\n\n#### Inputs\n\nAs shown in the illustration above, two input parameters are given to the model:\n\n- Code: 2-letter ISO 3166-1 country code\n- Size: IBAN length\n\n#### Hyper-parameters\n\nRegarding [hyper-parameters](https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)), the following configuration was used during training:\n\n- Input dataset with 105 items\n- 500 epochs\n- Mini-batch size of 32\n- [Adam](https://keras.io/api/optimizers/adam/) as optimizer function. Adam is a computationally efficient variant of stochastic gradient descent.\n- [Categorical cross entropy](https://keras.io/api/losses/probabilistic_losses/#categoricalcrossentropy-class) as error loss 
function.\n\n#### Output\n\nThe prediction output is an array of probabilities, where the maximum probability corresponds to the predicted JSON document.\n\nThe training process can be observed in the following chart:\n\n![Training result](doc/assets/training_result_chart.png)\n\n - The x-axis corresponds to training epochs.\n - The y-axis corresponds to error loss and accuracy scalar values.\n - The chart indicates a progression in accuracy and a reduction in loss over the training epochs.\n\nRegarding model fitting, in this use case over-fitting is not an undesired side effect but rather a requirement, since the model needs to adapt to the specific tabular data and \"memorize\" it.\n\nSpecific details of the model and Neural Network Topology can be observed in the\nnotebook [JSON Prediction Model](https://github.com/bsantanna/iban-validator-model/blob/main/notebooks/training/json/json_prediction_model.ipynb), which served as the main development and experiment environment.\n\nA prediction example which loads the trained model can be observed in the notebook [JSON Classification](https://github.com/bsantanna/iban-validator-model/blob/main/notebooks/prediction/json_classification.ipynb).\n\n\n\n### Module organization\n\nWhile Jupyter notebooks are great for prototyping purposes, the code was reorganized into a [continuous delivery](https://en.wikipedia.org/wiki/Continuous_delivery) ready structure under the `modules/` directory in order to distribute the model. 
This also permitted the introduction of a simple integration use case using [Java](#rest-api) and [gRPC](#grpc-service).\n\n#### Training\n\nThe training module contains the code used for declaring, compiling and training the model.\n\nThe most important files are:\n\n - [training/json_classification_model.py](https://github.com/bsantanna/iban-validator-model/blob/main/modules/training/json_classification_model.py) corresponds to the code used to declare, compile and train the model.\n - 
[training/data/country_validation_json.csv](https://github.com/bsantanna/iban-validator-model/blob/main/modules/training/data/country_validation_json.csv) corresponds to training input dataset.\n - training/data/json_classification_model/... corresponds to trained model serialized and stored in binary file format for further reuse.\n\nThe training process can be performed using the following command (from modules/training directory):\n\n```bash\n$ python3 json_classification_model.py\n```\n\n### Prediction\n\nPrediction module contains system integration and service interface declaration. \n\n![Prediction module diagram](doc/assets/prediction.png)\n\n#### gRPC Service\n\nThe gRPC service serves as main integration point as it creates a Remote Procedure Call pointing to trained model Prediction.\n\n![gRPC Service](doc/assets/grpc_flow_diagram.png)\n\nThe gRPC server process can be started using the following command (from modules/prediction/service directory):\n\n```bash\n$ python3 json_classification_service.py\n```\n\nIf all dependencies and pre-conditions are met, the model should be loaded into memory and gRPC server should start listening for requests on port `41151`\n\nThe service implements the server side of the following gRPC / Protobuf contract:\n\n```protobuf\nsyntax = \"proto3\";\n\nservice JSONClassificationService {\n  rpc getPrediction(InputFeatures) returns (OutputLabel) {}\n}\n\nmessage InputFeatures {\n  string iban = 1;\n}\n\nmessage OutputLabel {\n  string json = 1;\n}\n```\n\n - Contract publishes a service interface `getPrediction` which receives an `InputFeatures` object and returns an `OutputLabel` object.\n - `InputFeatures` object contains a single attribute `iban` with type string.\n - `OutputLabel` object contains a single attribute `json` with type string.\n - The implementation used [official gRPC documentation](https://www.grpc.io/docs/languages/python/quickstart/) as reference.  
\n\n#### REST API\n\nAn HTTP REST API is another deliverable of this project. It was created to simulate a system integration scenario, acting as a gRPC client for the same contract served by the [gRPC Service](#grpc-service).\n\nThe following endpoints are available:\n \n - GET `/json-prediction` : Returns the raw JSON classification predicted by the model, without regex validation\n - GET `/validation` : Returns the validation result with the predicted classification embedded\n \nBoth endpoints accept a single query parameter, `iban`.\n\nAssuming a Java Development Kit is available, there is a Maven project under `modules/prediction/rest-api`, which can be built and executed using the following commands:\n\n```bash\n$ cd modules/prediction/rest-api\n$ mvn clean install\n$ java -jar target/rest-api.jar\n```\n\nSee [usage example](#usage-example) for a quick reference.\n\n\n\n## Conclusion and references\n\nThe project reached its original goal of designing and implementing an Artificial Neural Network for validating IBAN account numbers.\n\nThe following items can be considered project deliverables:\n \n - An [IBAN / Country Regular Expression Dataset](https://www.kaggle.com/datasets/brunosantanna/iban-country-regex)\n - TensorFlow / Keras [model implementation](#trained-model).\n - A [notebook at Kaggle](https://www.kaggle.com/code/brunosantanna/iban-validator-model/notebook).\n - Docker Image [bsantanna/iban-validator-model](https://hub.docker.com/r/bsantanna/iban-validator-model) (x86_64 architecture only)\n - [REST API](#rest-api) endpoint\n - [gRPC service](#grpc-service) endpoint\n\nAs a possible future improvement, multiple models could be produced to move part of the algorithm which performs validation from runtime to model compilation time.\n\nAs a closing note, the following resources served as references for this project:\n\n - 
[https://www.swift.com/standards/data-standards/iban-international-bank-account-number](https://www.swift.com/standards/data-standards/iban-international-bank-account-number) \n - [http://toms-cafe.de/iban/iban.html](http://toms-cafe.de/iban/iban.html)\n - [https://github.com/open-ibans/ibans-python](https://github.com/open-ibans/ibans-python)\n - [https://keras.io/examples/structured_data/structured_data_classification_from_scratch/](https://keras.io/examples/structured_data/structured_data_classification_from_scratch/)\n - [https://keras.io/guides/preprocessing_layers/](https://keras.io/guides/preprocessing_layers/)\n\nCopyright 2022 [Bruno César Brito Sant’Anna](https://www.linkedin.com/in/brnsantanna/)\n\n\n\n## Changelog\n\nThe changelog is organized in reverse chronological order.\n\n### 2023-07\n\n- Used the model as a case study for the [DP-100 exam](https://learn.microsoft.com/en-us/certifications/exams/dp-100/)\n  - [Azure ML Notebooks](notebooks/azure-ml) \n\n### 2022-09\n\n- Creation of models using Python / Jupyter and TensorFlow / Keras.\n  - Created country prediction model based on structured data.\n  - Created JSON prediction model as an enhanced version of the country model.\n- Python gRPC and Java REST APIs.\n- Docker image distribution, with Apple M1 TensorFlow port\n- README documentation\n- Kaggle Notebook\n\n### 2022-08\n\n- Project started as a personal skill set development exercise.\n\n\n\n## License\n\nDistributed under the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0). See [LICENSE](LICENSE) for more\ninformation.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbsantanna%2Fiban-validator-model","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbsantanna%2Fiban-validator-model","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbsantanna%2Fiban-validator-model/lists"}