https://github.com/heardacat/iris-h2o-lambda

example of using h2o + lambda to deploy a solution

Host: GitHub
URL: https://github.com/heardacat/iris-h2o-lambda
Owner: HeardACat
License: mit
Created: 2017-10-19T09:50:08.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2017-10-23T21:12:18.000Z (over 7 years ago)
Last Synced: 2025-03-26T16:40:35.922Z (3 months ago)
Language: Java
Size: 66.4 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Readme

Ensure that `JAVA_HOME` is set so that `gradle` runs correctly.

```
gradlew wrapper
gradlew buildZip
gradlew buildDocker
```

Usage as Java Object
--------------------

```
gradlew run
```

This verifies that the main method can be called, which means the scorer can be integrated into any Java program as required. The main class should always take the form:

```java
RequestClass request = new RequestClass(...);
prediction(request);
```
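
To make the pattern above concrete, here is a minimal sketch of a main class that follows it. The body of `RequestClass`, the field names `c0` to `c3` (borrowed from the sample payload used with `docker-lambda`), and the rule inside `prediction` are all assumptions for illustration; the real classes in the repository may look different, and the placeholder rule is not the H2O model.

```java
// Hypothetical stand-ins: the real RequestClass and prediction(...) live in the repository.
class MainSketch {

    /** Illustrative request holding the four numeric inputs c0..c3. */
    static final class RequestClass {
        final double c0, c1, c2, c3;

        RequestClass(double c0, double c1, double c2, double c3) {
            this.c0 = c0;
            this.c1 = c1;
            this.c2 = c2;
            this.c3 = c3;
        }
    }

    /** Placeholder scorer: a hard-coded threshold on c2, NOT the H2O model. */
    static String prediction(RequestClass request) {
        return request.c2 < 2.5 ? "below-threshold" : "above-threshold";
    }

    public static void main(String[] args) {
        // Same shape as the README pattern: build a request, then score it.
        RequestClass request = new RequestClass(5.1, 3.5, 1.4, 0.2);
        System.out.println(prediction(request));
    }
}
```

Keeping `prediction` as a plain static method that takes a single request object is what would let the same call be reused from `main`, from a test, or from a Lambda handler.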

Testing using `docker-lambda`
-----------------------------

Run `gradlew buildDocker` first so that the function can be run using `docker-lambda`:

```
set PWD=
docker run -v "%PWD%/build/docker":/var/task lambci/lambda:java8 ModelScorer::handler "{\"c0\": 5.1, \"c1\": 3.5, \"c2\": 1.4, \"c3\": 0.2}"
```
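
The JSON payload in the command above has to be turned into a Java object before the handler can use it. The sketch below shows one shape that could work, assuming Lambda's built-in bean-style mapping (a no-argument constructor plus getters and setters named after the JSON keys) and a handler that takes only the input object. The `Request` and `Response` classes and the placeholder rule are hypothetical; the repository's `ModelScorer` may be written differently.

```java
// Hypothetical shapes for illustration; the repository's ModelScorer may differ.
class HandlerSketch {

    /** Bean-style request: one property per JSON key in the sample payload. */
    public static class Request {
        private double c0, c1, c2, c3;

        public Request() {}

        public double getC0() { return c0; }
        public void setC0(double c0) { this.c0 = c0; }
        public double getC1() { return c1; }
        public void setC1(double c1) { this.c1 = c1; }
        public double getC2() { return c2; }
        public void setC2(double c2) { this.c2 = c2; }
        public double getC3() { return c3; }
        public void setC3(double c3) { this.c3 = c3; }
    }

    /** Bean-style response that would be serialized back to JSON. */
    public static class Response {
        private String label;

        public Response() {}

        public String getLabel() { return label; }
        public void setLabel(String label) { this.label = label; }
    }

    /** Handler taking only the input object; the rule below is a placeholder, not the model. */
    public Response handler(Request request) {
        Response response = new Response();
        response.setLabel(request.getC2() < 2.5 ? "below-threshold" : "above-threshold");
        return response;
    }

    public static void main(String[] args) {
        // Mimics what the runtime would do with {"c0": 5.1, "c1": 3.5, "c2": 1.4, "c3": 0.2}.
        Request request = new Request();
        request.setC0(5.1);
        request.setC1(3.5);
        request.setC2(1.4);
        request.setC3(0.2);
        System.out.println(new HandlerSketch().handler(request).getLabel());
    }
}
```

In a real deployment the handler class would need to be public and live in its own source file so that a `Class::method` reference can find it.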

Framework
---------

This project performs real-time scoring using AWS Lambda. It allows simple pipelines to be built when constrained to basic Python operations only. The source code is organised as follows:

* `src/main/java/ModelScorer.java`: handles the payload from AWS. In an ideal world, this file is left untouched; the data scientist would only touch the `feature_preprocessing` and `decision_engine` scripts.
* `src/main/java/irisModel.java`: the POJO file automatically generated by H2O. It was generated from the Jupyter notebook in this repository.
* `src/main/resources/pipeline.py`: executes the scoring pipeline; should generally be left untouched.
* `src/main/resources/feature_preprocessing.py`: holds any feature preprocessing that must be completed before the data is scored by H2O.
* `src/main/resources/decision_engine.py`: modifies the output of the scoring object before it is sent in the outbound payload. This could include threshold adjustments, alert text, rule overrides, etc.
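
The three-stage flow described by these files (preprocess, score, decide) can be sketched in plain Java as follows. This is only an analogue for illustration: in the repository the preprocessing and decision stages are Python scripts and the scoring stage is the H2O POJO, whereas every method, threshold, and label below is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java analogue of the three-stage flow; every name and rule here is hypothetical.
class PipelineSketch {

    /** Stage 1, feature preprocessing: here, clamp negative measurements to zero. */
    static Map<String, Double> preprocess(Map<String, Double> raw) {
        Map<String, Double> cleaned = new LinkedHashMap<>();
        raw.forEach((key, value) -> cleaned.put(key, Math.max(0.0, value)));
        return cleaned;
    }

    /** Stage 2, scoring: a made-up score standing in for the H2O model output. */
    static double score(Map<String, Double> features) {
        return features.getOrDefault("c2", 0.0) / 10.0;
    }

    /** Stage 3, decision engine: a threshold that turns the score into alert text. */
    static String decide(double score) {
        return score >= 0.25 ? "ALERT" : "OK";
    }

    /** Runs the stages in order: preprocess, then score, then decide. */
    static String run(Map<String, Double> payload) {
        return decide(score(preprocess(payload)));
    }

    public static void main(String[] args) {
        Map<String, Double> payload = new LinkedHashMap<>();
        payload.put("c0", 5.1);
        payload.put("c1", 3.5);
        payload.put("c2", 1.4);
        payload.put("c3", 0.2);
        System.out.println(run(payload));
    }
}
```

Keeping each stage as its own function mirrors the split between the files: the first and last stages can change with business rules while the scoring stage is regenerated from the model.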

Proposed Workflow
-----------------

If there is no change to the payload interface, the process for updating models should be:

* Modify the `feature_preprocessing` and `decision_engine` functions as needed. Rely on the automatically generated POJO files for the H2O modelling portion.
* Package the solution using `gradlew buildDocker` and test it using `docker-lambda` (with a suite of tests?).
* If it passes all relevant tests, bundle and deploy using `gradlew buildZip`.

![screenshot](screenshot-iris-h2o-lambda.png)
