https://github.com/humangraphics/backend-lambda-layer

AWS Lambda layer for running ML models with SnapStart

aws-lambda aws-lambda-java aws-lambda-layer ml snapstart


Host: GitHub
URL: https://github.com/humangraphics/backend-lambda-layer
Owner: humangraphics
Created: 2023-10-18T17:01:58.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-10-02T20:05:40.000Z (over 1 year ago)
Last Synced: 2025-03-20T21:44:23.234Z (11 months ago)
Topics: aws-lambda, aws-lambda-java, aws-lambda-layer, ml, snapstart
Language: Java
Homepage:
Size: 78.1 KB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# backend-lambda-layer

This is the Lambda layer that allows HumanGraphics to run AI/ML models in AWS Lambda using SnapStart, as referenced in [Java AI/ML on Lambda with Human Graphics | Serverless Office Hours](https://www.youtube.com/watch?v=wy2QYgKAoEg&t=5s).

## Overview

The layer makes a few important preparations to the Lambda runtime during initialization:

1. Use a custom `AWSLambda` implementation to load the application in the system classloader. By default, the application is loaded in a separate classloader. This is important because the JVM allows a given native library to be loaded by only one classloader at a time, so splitting classloaders causes problems with JNI loading.
2. Modify the classpath to include the `/var/task` folder, which is where the application is unzipped. This is important because services are only allowed to load from the classpath.
3. Load so-called "tmpdump" archives from S3. This allows applications to load arbitrary application data into the `/tmp` folder during initialization; in practice, this is used for native libraries. This is important because native libraries for ML are frequently large, but a Lambda deployment package (including layers) can only be up to 250 MB unzipped.
4. Modify the library loading paths to include `/tmp/lib`. Both `java.library.path` and `LD_LIBRARY_PATH` are customized. This is important because native libraries should be loadable from tmpdumps.
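
As a rough illustration of steps 3 and 4, the sketch below unpacks a ZIP stream into a target directory and builds a library path string with an extra directory in front. It is not the layer's actual code: the class and method names are hypothetical, the S3 download is left out (any `InputStream` works), and it does not recreate symlinks stored in the archive.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, for illustration only.
class TmpdumpSketch {

    /** Extracts every entry of a ZIP stream under targetDir; returns the number of files written. */
    static int extract(InputStream zipStream, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        int files = 0;
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would land outside the target directory.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Copies the bytes of the current entry only; the stream stays open.
                    Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                    files++;
                }
            }
        }
        return files;
    }

    /** Puts dir in front of a path-separated list such as the value of java.library.path. */
    static String prependLibraryPath(String dir, String existing) {
        if (existing == null || existing.isEmpty()) {
            return dir;
        }
        return dir + File.pathSeparator + existing;
    }
}
```

With a tmpdump laid out as `lib/library.so`, calling `TmpdumpSketch.extract(stream, Paths.get("/tmp"))` would place the library at `/tmp/lib/library.so`, which is the directory step 4 adds to the loading paths.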

## Releasing this artifact

This artifact is automatically deployed on merge to main. Active development should happen on the `v2.4.x` (or similar) branch, and then be merged via PR. Deployment creates a Lambda layer automatically using the CloudFormation `cfn-deploy.yml` template via continuous delivery.

## Creating tmpdumps

Again, tmpdumps are typically used to house (large) native libraries. To create a tmpdump using javacpp libraries, perform the following steps:

1. Enable debug printing in javacpp using `-Dorg.bytedeco.javacpp.logger.debug=true` and run the application, being sure to exercise all code paths that load libraries.
2. Inspect the debug logs to find the libraries which are loaded and essential to the application. This command can be useful to extract loaded libraries: `grep 'Loading' debug.log | grep '[.]so' | awk '{print $3}' | sed -e 's!^.*[.]jar/!/!;'`. This command can be useful to extract loaded classes: `grep 'Loading class' debug.log | awk '{print $4}' | sort | uniq -c | sort -nr`.
3. Add the javacpp `cache` mojo to the POM, caching the essential libraries. It may be prudent to add this to a profile.
4. Run the build using `mvn -Dorg.bytedeco.javacpp.cachedir=target/javacpp/lib -Dorg.bytedeco.javacpp.cachedir.nosubdir=true -Djavacpp.platform=linux-x86_64 clean compile install`.
5. Add the cached libraries from the x86_64 platform to a ZIP file under `lib/` (for example, `lib/library.so`). To save space, store library version aliases as symlinks using the `zip -y` flag.
6. Upload to S3 at `$BUCKET/tmpdump/$LAMBDA_NAME.zip`.
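
For anyone who prefers to do the log filtering from step 2 in Java, the sketch below applies the same text transformation as the library-extraction pipeline: keep lines containing `Loading` and `.so`, take the third whitespace-separated field, and strip everything up to and including `.jar/`. The class name is hypothetical, and the exact javacpp log format is an assumption; only the transformation itself comes from the command above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class LoadedLibrariesSketch {
    private static final Pattern SHARED_OBJECT = Pattern.compile("[.]so");

    /** Mirrors: grep 'Loading' | grep '[.]so' | awk '{print $3}' | sed -e 's!^.*[.]jar/!/!;' */
    static List<String> loadedLibraries(List<String> logLines) {
        List<String> result = new ArrayList<>();
        for (String line : logLines) {
            if (!line.contains("Loading") || !SHARED_OBJECT.matcher(line).find()) {
                continue;
            }
            // Like awk, split on runs of whitespace after dropping leading blanks.
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 3) {
                continue; // awk would print an empty line here; this sketch skips it
            }
            // Greedy match, like sed: everything through the last ".jar/" becomes "/".
            result.add(fields[2].replaceFirst("^.*[.]jar/", "/"));
        }
        return result;
    }
}
```

Feeding it the lines of `debug.log` (for example via `Files.readAllLines`) yields one in-JAR library path per matching line, duplicates included, which can then be compared against the contents of the cache directory.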

Example instructions for refining a tmpdump:

    cd target/javacpp
    rm -f ../../tmpdump.zip
    zip -y -r ../../tmpdump.zip lib/
    cd ../..
    zip -d tmpdump.zip 'lib/.lock'
    zip -d tmpdump.zip 'lib/*openblas_nolapack*'

TODO: Figure out how to load AVX2 platform libs.

When releasing the artifact, these library files should be removed from the JAR. This is a good starting point for what to remove:

    zip -d target/lambda-analyze-human-face-race.jar 'org/bytedeco/**/*.so*'
    zip -d target/lambda-analyze-human-face-race.jar 'org/bytedeco/**/*.a'
    zip -d target/lambda-analyze-human-face-race.jar 'org/bytedeco/**/*.xml'
    zip -d target/lambda-analyze-human-face-race.jar 'org/bytedeco/**/*.h'
    zip -d target/lambda-analyze-human-face-race.jar 'org/bytedeco/**/*.hpp'
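
If the same cleanup needs to run from Java (for instance inside a build step), something along these lines might work. It is only a sketch: the class name is hypothetical, and the file-name checks approximate the `zip -d` patterns above rather than reproduce them exactly. It uses the JDK's built-in ZIP file system, which rewrites the archive when the file system is closed.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only.
class JarStripSketch {

    /** Approximates the patterns *.so*, *.a, *.xml, *.h and *.hpp on the file name. */
    static boolean isNativeArtifact(String fileName) {
        return fileName.contains(".so")
                || fileName.endsWith(".a")
                || fileName.endsWith(".xml")
                || fileName.endsWith(".h")
                || fileName.endsWith(".hpp");
    }

    /** Deletes matching files under org/bytedeco inside the JAR; returns how many were removed. */
    static int strip(Path jar) throws IOException {
        // The cast picks the (Path, ClassLoader) overload, which exists on all JDKs since 7.
        try (FileSystem fs = FileSystems.newFileSystem(jar, (ClassLoader) null)) {
            Path root = fs.getPath("/org/bytedeco");
            if (!Files.isDirectory(root)) {
                return 0;
            }
            List<Path> doomed;
            try (Stream<Path> walk = Files.walk(root)) {
                doomed = walk.filter(Files::isRegularFile)
                        .filter(p -> isNativeArtifact(p.getFileName().toString()))
                        .collect(Collectors.toList());
            }
            for (Path p : doomed) {
                Files.delete(p);
            }
            return doomed.size();
        }
    }
}
```

Collecting the matches before deleting keeps the walk and the mutation separate, so the directory stream is never modified while it is being iterated.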
