https://github.com/heardacat/iris-h2o-lambda
example of using h2o + lambda to deploy a solution
- Host: GitHub
- URL: https://github.com/heardacat/iris-h2o-lambda
- Owner: HeardACat
- License: mit
- Created: 2017-10-19T09:50:08.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-10-23T21:12:18.000Z (over 7 years ago)
- Last Synced: 2025-03-26T16:40:35.922Z (3 months ago)
- Language: Java
- Size: 66.4 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Readme
Ensure that `JAVA_HOME` is set to run `gradle` correctly.
```
gradlew wrapper
gradlew buildZip
gradlew buildDocker
```

Usage as Java Object
--------------------

```
gradlew run
```

This verifies that the main method can be called, so the scorer can be integrated into any Java program as required. The main class should always take the form:
```java
RequestClass request = new RequestClass(...);
prediction(request);
```

Testing using `docker-lambda`
-----------------------------

Run `buildDocker` first so that the build output can be run with `docker-lambda`:
```
set PWD=
docker run -v "%PWD%/build/docker":/var/task lambci/lambda:java8 ModelScorer::handler "{\"c0\": 5.1, \"c1\": 3.5, \"c2\": 1.4, \"c3\": 0.2}"
```

Framework
---------

Realtime scoring using AWS Lambda. The framework supports simple pipelines that are restricted to basic Python operations. The source code is organised as follows:
* `src/main/java/ModelScorer.java`: handles the payload from AWS. Ideally this file is left untouched; the data scientist only edits the `feature_preprocessing` and `decision_engine` scripts.
* `src/main/java/irisModel.java`: the POJO file generated automatically by H2O. It was generated from the Jupyter notebook in this repository.
* `src/main/resources/pipeline.py`: executes the scoring pipeline. It should generally be left untouched.
* `src/main/resources/feature_preprocessing.py`: holds any feature preprocessing that must be applied before the data is scored by H2O.
* `src/main/resources/decision_engine.py`: modifies the output of the scoring object before it is written to the outbound payload. This could include threshold adjustments, alert text, rule overrides, etc.

Proposed Workflow
-----------------

If there is no change to the payload interface, the process for updating models should be:
* Modify the `feature_preprocessing` and `decision_engine` functions as needed. Rely on the automatically generated POJO files for the H2O modelling portion.
* Package the solution using `gradlew buildDocker` and test it using `docker-lambda` (with a suite of tests?).
* If all relevant tests pass, bundle and deploy using `gradlew buildZip`.
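
As a rough illustration of the two scripts this workflow asks the data scientist to modify, the sketch below shows one way `feature_preprocessing` and `decision_engine` could be written as plain functions and chained together. It is not the repository's code: the function signatures, the `run_pipeline` and `fake_score` helpers, the class labels, and the `0.5` threshold are all assumptions made for illustration. Only the payload keys `c0` to `c3` come from the `docker-lambda` example above.

```python
# Hypothetical sketch: names, signatures, labels, and the 0.5 threshold are
# assumptions for illustration, not the repository's actual code.


def feature_preprocessing(payload):
    """Turn the inbound payload into the ordered feature list for scoring."""
    # The sample payload in this README uses the keys c0..c3.
    return [float(payload[key]) for key in ("c0", "c1", "c2", "c3")]


def decision_engine(class_probabilities, threshold=0.5):
    """Post-process the model output before it goes into the outbound payload."""
    # Pick the class with the highest probability.
    label, probability = max(
        class_probabilities.items(), key=lambda item: item[1]
    )
    return {
        "label": label,
        "probability": probability,
        # Example of a rule layered on top of the model output:
        # raise an alert when the winning class is not confident enough.
        "alert": probability < threshold,
    }


def run_pipeline(payload, score):
    """Chain preprocessing, scoring, and the decision step."""
    features = feature_preprocessing(payload)
    return decision_engine(score(features))


def fake_score(features):
    # Stand-in for the H2O POJO, which the real project calls from Java.
    # The labels and probabilities here are made up.
    return {"setosa": 0.9, "versicolor": 0.07, "virginica": 0.03}
```

Keeping both steps as plain functions of their inputs means a suite of tests could exercise them directly, for example by calling `run_pipeline(payload, fake_score)` with a fixed payload, before the packaged build is tried under `docker-lambda`.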