https://github.com/gravitee-io/gravitee-inference
A set of libraries to integrate ML/AI in gravitee projects
- Host: GitHub
- URL: https://github.com/gravitee-io/gravitee-inference
- Owner: gravitee-io
- License: apache-2.0
- Created: 2025-04-16T14:55:45.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2026-04-10T14:58:00.000Z (3 months ago)
- Last Synced: 2026-04-10T16:29:40.211Z (3 months ago)
- Topics: ai, inference, machine-learning, security-scan
- Language: Java
- Homepage:
- Size: 148 KB
- Stars: 2
- Watchers: 9
- Forks: 0
- Open Issues: 12
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
- Codeowners: .github/CODEOWNERS
# gravitee-inference
**gravitee-inference** is a Java library that lets engineering teams integrate and deploy AI models within the Gravitee platform without needing specialized help from AI/ML teams.
---
## Requirements
- Java 21
- Maven (`mvn`)
---
## Import libraries
Add the following dependencies to your `pom.xml`:
```xml
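<dependency>
    <groupId>io.gravitee.inference.math.native</groupId>
    <artifactId>gravitee-inference-math-native</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>
<dependency>
    <groupId>io.gravitee.inference.api</groupId>
    <artifactId>gravitee-inference-api</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>
<dependency>
    <groupId>io.gravitee.inference.onnx</groupId>
    <artifactId>gravitee-inference-onnx</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>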
```
## Supported AI Models
### BERT (via ONNX)
BERT-architecture models in ONNX format are supported for the following NLP tasks:
- Sequence Classification
- Token Classification
- Fill-mask
- Vector Embedding (e.g., Sentence Similarity)
---
### 🧠 Sequence Classification
Use this to determine sentiment or categorize full sentences.
```java
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);

var configuration = Map.of(
    Constants.CLASSIFIER_MODE, ClassifierMode.SEQUENCE,
    Constants.CLASSIFIER_LABELS, List.of("Negative", "Positive")
);

var onnxConfig = new OnnxBertConfig(
    resource,
    NativeMath.INSTANCE,
    configuration
);

var model = new OnnxBertClassifierModel(onnxConfig);

// Single sentence
var results = model.infer("I am so happy!").results();
results.forEach(result -> {
    System.out.println("Label: " + result.label());
    System.out.println("Score: " + result.score());
});

// Multiple sentences
model.infer(List.of("I am so happy!", "I am so sad!"));
```
> Try this with [`distilbert-base-uncased-finetuned-sst-2-english`](https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
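
Sequence classifiers of this kind usually emit one raw logit per label, which is then normalized into a score. The sketch below illustrates that step with a plain softmax. It is a standalone illustration under that assumption, not the library's internal code: `SoftmaxSketch`, `softmax`, `argMax`, and the logit values are all hypothetical.

```java
import java.util.List;

final class SoftmaxSketch {

    // Numerically stable softmax: subtract the largest logit before exponentiating.
    static double[] softmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double logit : logits) {
            max = Math.max(max, logit);
        }
        double sum = 0.0;
        double[] scores = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            scores[i] = Math.exp(logits[i] - max);
            sum += scores[i];
        }
        for (int i = 0; i < scores.length; i++) {
            scores[i] /= sum;
        }
        return scores;
    }

    // Index of the largest value (first one wins on ties).
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> labels = List.of("Negative", "Positive");
        double[] logits = {-1.2, 3.4}; // made-up model output for illustration
        double[] scores = softmax(logits);
        int best = argMax(scores);
        System.out.println("Label: " + labels.get(best));
        System.out.println("Score: " + scores[best]);
    }
}
```

The order of `CLASSIFIER_LABELS` matters under this assumption: position `i` in the label list is paired with logit `i`.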
---
### 🧾 Token Classification
Use this to extract structured entities like names, locations, and organizations from text.
```java
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);

var configuration = Map.of(
    Constants.CLASSIFIER_MODE, ClassifierMode.TOKEN,
    Constants.CLASSIFIER_LABELS, List.of(
        "O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"
    ),
    // Predictions carrying these labels are dropped from the results
    Constants.DISCARD_LABELS, List.of("O", "B-MISC", "I-MISC")
);

var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, configuration);
var model = new OnnxBertClassifierModel(onnxConfig);

// Single sentence
var results = model.infer("My name is Laura and I live in Houston, Texas").results();
results.forEach(result -> {
    System.out.println("Label: " + result.label());
    System.out.println("Score: " + result.score());
    System.out.println("Begin: " + result.begin());
    System.out.println("End: " + result.end());
});
```
```java
// Multiple sentences
model.infer(List.of(
    "My name is Laura and I live in Houston, Texas",
    "My name is Clara and I live in Berkeley, California"
));
```
> Try this with [`dslim/bert-base-NER`](https://huggingface.co/dslim/bert-base-NER/).
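
To make the role of `DISCARD_LABELS` and the `begin`/`end` fields concrete, here is a minimal standalone sketch of filtering token-level predictions and mapping them back to the input text. It assumes `begin` and `end` are character offsets with an exclusive end, which the README does not state. `DiscardLabelsSketch`, `TokenResult`, and `keep` are hypothetical names, and the scores are made up.

```java
import java.util.List;
import java.util.Set;

final class DiscardLabelsSketch {

    // Hypothetical stand-in for one token-level prediction.
    record TokenResult(String label, float score, int begin, int end) {}

    // Keeps only the predictions whose label is not in the discard set.
    static List<TokenResult> keep(List<TokenResult> results, Set<String> discard) {
        return results.stream()
            .filter(result -> !discard.contains(result.label()))
            .toList();
    }

    public static void main(String[] args) {
        String text = "My name is Laura and I live in Houston, Texas";
        // Made-up predictions; offsets assume [begin, end) character positions.
        List<TokenResult> raw = List.of(
            new TokenResult("O", 0.99f, 0, 2),
            new TokenResult("B-PER", 0.98f, 11, 16),
            new TokenResult("B-LOC", 0.97f, 31, 38),
            new TokenResult("B-LOC", 0.96f, 40, 45)
        );
        Set<String> discard = Set.of("O", "B-MISC", "I-MISC");

        for (TokenResult result : keep(raw, discard)) {
            String span = text.substring(result.begin(), result.end());
            System.out.println(result.label() + " -> " + span);
        }
    }
}
```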
---
### 🎭 Fill Mask
Predict masked tokens in a sentence.
```java
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);

var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, Map.of());
var model = new OnnxBertFillMaskInference(onnxConfig);

// Single sentence
var results = model.infer("The capital of France is [MASK].");
System.out.println(results.getFirst().label()); // Paris
```
```java
// Multiple sentences
model.infer(List.of(
    "The capital of France is [MASK].",
    "The capital of [MASK] is London."
));
```
> Try this with [`google-bert/bert-base-uncased`](https://huggingface.co/google-bert/bert-base-uncased).
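
A fill-mask model typically produces one score per vocabulary entry at the masked position, and the best candidates are the highest-scoring entries. The following sketch shows that top-k selection in isolation. It assumes nothing about how the library ranks candidates; `TopKSketch`, `topK`, the four-word vocabulary, and the scores are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

final class TopKSketch {

    // Returns the indices of the k largest scores, highest first.
    // Ties keep their original index order because the sort is stable.
    static List<Integer> topK(double[] scores, int k) {
        return IntStream.range(0, scores.length)
            .boxed()
            .sorted(Comparator.comparingDouble((Integer i) -> scores[i]).reversed())
            .limit(k)
            .toList();
    }

    public static void main(String[] args) {
        // Made-up vocabulary and scores for the masked position.
        String[] vocabulary = {"london", "paris", "berlin", "rome"};
        double[] scores = {0.05, 0.80, 0.10, 0.05};

        for (int index : topK(scores, 2)) {
            System.out.println(vocabulary[index] + " (" + scores[index] + ")");
        }
    }
}
```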
---
### 📐 Vector Embeddings
Convert text into dense vector representations for similarity search or indexing.
```java
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);

var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, Map.of(
    POOLING_MODE, PoolingMode.MEAN,
    Constants.MAX_SEQUENCE_LENGTH, 512
));

var model = new OnnxBertEmbeddingModel(onnxConfig);

EmbeddingTokenCount embedding = model.infer("The big brown fox jumped over the lazy dog");
System.out.println(embedding.embedding().length); // 384
System.out.println(embedding.tokenCount()); // 11

// Similarity comparison
EmbeddingTokenCount embedding1 = model.infer("The big brown fox jumped over the lazy dog");
EmbeddingTokenCount embedding2 = model.infer("The brown fox jumped over the dog");
System.out.println(
    onnxConfig.gioMaths().cosineScore(embedding1.embedding(), embedding2.embedding())
);
```
> Try this with [`Xenova/all-MiniLM-L6-v2`](https://huggingface.co/Xenova/all-MiniLM-L6-v2).
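
For readers unfamiliar with the two operations named above, this standalone sketch shows what mean pooling and cosine similarity conventionally compute. It is an illustration of the textbook definitions, not the library's `PoolingMode.MEAN` or `cosineScore` implementation, which may differ in detail. `EmbeddingMathSketch`, `meanPool`, and `cosine` are hypothetical names, and the tiny 2-dimensional vectors are made up.

```java
final class EmbeddingMathSketch {

    // Averages the token vectors whose attention-mask entry is non-zero,
    // so that padding tokens do not influence the sentence embedding.
    static float[] meanPool(float[][] tokenVectors, int[] attentionMask) {
        int dim = tokenVectors[0].length;
        float[] pooled = new float[dim];
        int counted = 0;
        for (int t = 0; t < tokenVectors.length; t++) {
            if (attentionMask[t] == 0) {
                continue; // skip padding
            }
            for (int d = 0; d < dim; d++) {
                pooled[d] += tokenVectors[t][d];
            }
            counted++;
        }
        int divisor = Math.max(counted, 1); // avoid dividing by zero
        for (int d = 0; d < dim; d++) {
            pooled[d] /= divisor;
        }
        return pooled;
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns NaN for a zero vector.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Three token vectors; the last one is padding and is masked out.
        float[][] tokens = {{1f, 0f}, {3f, 4f}, {9f, 9f}};
        int[] mask = {1, 1, 0};
        float[] sentence = meanPool(tokens, mask);
        System.out.println(sentence[0] + ", " + sentence[1]);
        System.out.println(cosine(sentence, new float[] {1f, 0f}));
    }
}
```

Scores close to 1 indicate vectors pointing in nearly the same direction, and scores near 0 indicate unrelated directions.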
---
### ⚡ SIMD Capabilities
To run with SIMD math acceleration:
1. Add the following to your JVM arguments:
```sh
--add-modules jdk.incubator.vector
```
2. Add the corresponding dependency:
```xml
<dependency>
    <groupId>io.gravitee.inference.math.simd</groupId>
    <artifactId>gravitee-inference-math-simd</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>
```
3. Obtain a `GioMaths` instance from the factory:
```java
import io.gravitee.inference.math.simd.factory.SIMDMathFactory;
GioMaths maths = SIMDMathFactory.gioMaths();
```
The factory resolves at runtime which SIMD capabilities your CPU supports.
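
If SIMD acceleration does not seem to take effect, one thing worth checking is whether the `--add-modules` flag actually reached the JVM. The sketch below uses the standard `ModuleLayer` API to report whether a module was resolved at startup. `VectorModuleCheck` and `isModuleResolved` are hypothetical names that are not part of gravitee-inference.

```java
final class VectorModuleCheck {

    // True if the named module is part of the JVM's boot layer.
    static boolean isModuleResolved(String moduleName) {
        return ModuleLayer.boot().findModule(moduleName).isPresent();
    }

    public static void main(String[] args) {
        // Incubator modules are only resolved when requested via --add-modules.
        System.out.println(
            "jdk.incubator.vector resolved: " + isModuleResolved("jdk.incubator.vector")
        );
    }
}
```

Running this with and without `--add-modules jdk.incubator.vector` should show the difference between the two launch configurations.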