https://github.com/gravitee-io/gravitee-inference

A set of libraries to integrate ML/AI in gravitee projects

ai inference machine-learning security-scan


Host: GitHub
URL: https://github.com/gravitee-io/gravitee-inference
Owner: gravitee-io
License: apache-2.0
Created: 2025-04-16T14:55:45.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2026-04-10T14:58:00.000Z (3 months ago)
Last Synced: 2026-04-10T16:29:40.211Z (3 months ago)
Topics: ai, inference, machine-learning, security-scan
Language: Java
Homepage:
Size: 148 KB
Stars: 2
Watchers: 9
Forks: 0
Open Issues: 12
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
- Codeowners: .github/CODEOWNERS


# gravitee-inference

**gravitee-inference** is a Java library that lets engineering teams integrate and deploy AI models within the Gravitee platform without specialized help from AI/ML teams.

---

## Requirements

- Java 21

- Maven (`mvn`)

---

## Import libraries

Add the following dependencies to your `pom.xml`:

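```xml
<dependency>
  <groupId>io.gravitee.inference.math.native</groupId>
  <artifactId>gravitee-inference-math-native</artifactId>
  <version>${gravitee.inference.version}</version>
</dependency>

<dependency>
  <groupId>io.gravitee.inference.api</groupId>
  <artifactId>gravitee-inference-api</artifactId>
  <version>${gravitee.inference.version}</version>
</dependency>

<dependency>
  <groupId>io.gravitee.inference.onnx</groupId>
  <artifactId>gravitee-inference-onnx</artifactId>
  <!-- version assumed to match the other modules -->
  <version>${gravitee.inference.version}</version>
</dependency>
```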

## Supported AI Models

### BERT (via ONNX)

The BERT architecture is supported in ONNX format for the following NLP tasks:

- Sequence Classification

- Token Classification

- Fill-mask

- Vector Embedding (e.g., Sentence Similarity)

---

### 🧠 Sequence Classification

Use this to determine sentiment or categorize full sentences.

```java
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;

// Point the resource at the exported ONNX model and its tokenizer
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);

// Sequence mode with the two sentiment labels
var configuration = Map.of(
    Constants.CLASSIFIER_MODE, ClassifierMode.SEQUENCE,
    Constants.CLASSIFIER_LABELS, List.of("Negative", "Positive")
);

var onnxConfig = new OnnxBertConfig(
    resource,
    NativeMath.INSTANCE,
    configuration
);

var model = new OnnxBertClassifierModel(onnxConfig);

// Single sentence
var results = model.infer("I am so happy!").results();
results.forEach(result -> {
    System.out.println("Label: " + result.label());
    System.out.println("Score: " + result.score());
});

// Multiple sentences
model.infer(List.of("I am so happy!", "I am so sad!"));
```

> Try this with [`distilbert-base-uncased-finetuned-sst-2-english`](https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
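For intuition about the `score` values, classification models usually emit one raw logit per label, and a softmax turns those logits into probabilities. The sketch below shows that step in plain Java. It is an illustration only: `SoftmaxSketch` and its methods are hypothetical names, the logits are made up, and whether gravitee-inference computes its scores exactly this way is an assumption.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of gravitee-inference. */
final class SoftmaxSketch {

    /** Converts raw logits into probabilities that sum to 1. */
    static double[] softmax(double[] logits) {
        // Subtract the maximum before exponentiating for numerical stability.
        double max = Double.NEGATIVE_INFINITY;
        for (double logit : logits) {
            max = Math.max(max, logit);
        }
        double sum = 0.0;
        double[] probabilities = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            probabilities[i] = Math.exp(logits[i] - max);
            sum += probabilities[i];
        }
        for (int i = 0; i < probabilities.length; i++) {
            probabilities[i] /= sum;
        }
        return probabilities;
    }

    /** Returns the label whose probability is highest. */
    static String bestLabel(double[] logits, List<String> labels) {
        double[] probabilities = softmax(logits);
        int best = 0;
        for (int i = 1; i < probabilities.length; i++) {
            if (probabilities[i] > probabilities[best]) {
                best = i;
            }
        }
        return labels.get(best);
    }

    public static void main(String[] args) {
        // Made-up logits for the labels ["Negative", "Positive"].
        double[] logits = {-2.0, 3.0};
        System.out.println(bestLabel(logits, List.of("Negative", "Positive")));
    }
}
```

The label list is matched to the logits by position here, which is why the order of the labels matters in this sketch.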

---

### 🧾 Token Classification

Use this to extract structured entities like names, locations, and organizations from text.

```java
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;

var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);

// Token mode with the model's tag set; DISCARD_LABELS lists tags to drop from the results
var configuration = Map.of(
    Constants.CLASSIFIER_MODE, ClassifierMode.TOKEN,
    Constants.CLASSIFIER_LABELS, List.of(
        "O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"
    ),
    Constants.DISCARD_LABELS, List.of("O", "B-MISC", "I-MISC")
);

var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, configuration);
var model = new OnnxBertClassifierModel(onnxConfig);

var results = model.infer("My name is Laura and I live in Houston, Texas").results();
results.forEach(result -> {
    System.out.println("Label: " + result.label());
    System.out.println("Score: " + result.score());
    System.out.println("Begin: " + result.begin());
    System.out.println("End: " + result.end());
});
```

```java
// Multiple sentences
model.infer(List.of(
    "My name is Laura and I live in Houston, Texas",
    "My name is Clara and I live in Berkeley, California"
));
```

> Try this with [`dslim/bert-base-NER`](https://huggingface.co/dslim/bert-base-NER/).
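The labels above follow the BIO scheme: `B-` marks the first token of an entity, `I-` a continuation, and `O` a token outside any entity. A common post-processing step is to drop unwanted labels and merge each `B-`/`I-` run into a single span. The sketch below shows one way to do that in plain Java. `BioMergeSketch`, `Span`, and `merge` are hypothetical names, the token predictions are made up, and the README does not say whether the library merges spans like this internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical post-processing helper; not part of gravitee-inference. */
final class BioMergeSketch {

    /** One labelled span with character offsets into the input text. */
    record Span(String label, int begin, int end) {}

    /** Strips a leading "B-" or "I-"; other labels are returned unchanged. */
    static String entityType(String label) {
        boolean prefixed = label.startsWith("B-") || label.startsWith("I-");
        return prefixed ? label.substring(2) : label;
    }

    /** Drops discarded labels and merges B-X followed by I-X tokens into one span X. */
    static List<Span> merge(List<Span> tokens, Set<String> discard) {
        List<Span> entities = new ArrayList<>();
        Span current = null;
        for (Span token : tokens) {
            if (discard.contains(token.label())) {
                // A discarded token ends any entity in progress.
                if (current != null) entities.add(current);
                current = null;
                continue;
            }
            String type = entityType(token.label());
            boolean continues = current != null
                && token.label().startsWith("I-")
                && current.label().equals(type);
            if (continues) {
                // Extend the running entity to cover this token.
                current = new Span(type, current.begin(), token.end());
            } else {
                if (current != null) entities.add(current);
                current = new Span(type, token.begin(), token.end());
            }
        }
        if (current != null) entities.add(current);
        return entities;
    }

    public static void main(String[] args) {
        // Made-up predictions for the text "New York is where Laura lives".
        List<Span> tokens = List.of(
            new Span("B-LOC", 0, 3), new Span("I-LOC", 4, 8),
            new Span("O", 9, 11), new Span("O", 12, 17),
            new Span("B-PER", 18, 23), new Span("O", 24, 29)
        );
        System.out.println(merge(tokens, Set.of("O")));
    }
}
```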

---

### 🎭 Fill Mask

Predict masked tokens in a sentence.

```java
import java.nio.file.Paths;
import java.util.Map;

var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);

// No task-specific options are set here
var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, Map.of());
var model = new OnnxBertFillMaskInference(onnxConfig);

var results = model.infer("The capital of France is [MASK].");
System.out.println(results.getFirst().label()); // Paris
```

```java
// Multiple sentences
model.infer(List.of(
    "The capital of France is [MASK].",
    "The capital of [MASK] is London."
));
```

> Try this with [`google-bert/bert-base-uncased`](https://huggingface.co/google-bert/bert-base-uncased).
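Conceptually, a fill-mask model produces one logit per vocabulary entry at the `[MASK]` position, and the candidates are the entries with the highest logits. The following plain-Java sketch ranks a toy vocabulary that way. `FillMaskSketch` and `topK` are hypothetical names, the vocabulary and logits are made up, and this is not a description of how `OnnxBertFillMaskInference` is implemented.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

/** Hypothetical helper for illustration; not part of gravitee-inference. */
final class FillMaskSketch {

    /** Returns the k vocabulary entries with the highest logits, best first. */
    static List<String> topK(double[] maskLogits, List<String> vocabulary, int k) {
        return IntStream.range(0, maskLogits.length)
            .boxed()
            // Sort vocabulary indices by their logit, largest first.
            .sorted(Comparator.comparingDouble((Integer i) -> maskLogits[i]).reversed())
            .limit(k)
            .map(vocabulary::get)
            .toList();
    }

    public static void main(String[] args) {
        // Made-up logits at the [MASK] position for a three-word vocabulary.
        List<String> vocabulary = List.of("london", "paris", "berlin");
        double[] maskLogits = {1.0, 4.5, 2.0};
        System.out.println(topK(maskLogits, vocabulary, 2));
    }
}
```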

---

### 📐 Vector Embeddings

Convert text into dense vector representations for similarity search or indexing.

```java
import java.nio.file.Paths;
import java.util.Map;

var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);

// POOLING_MODE is used unqualified here, which assumes a static import
var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, Map.of(
    POOLING_MODE, PoolingMode.MEAN,
    Constants.MAX_SEQUENCE_LENGTH, 512
));

var model = new OnnxBertEmbeddingModel(onnxConfig);

EmbeddingTokenCount embedding = model.infer("The big brown fox jumped over the lazy dog");
System.out.println(embedding.embedding().length); // 384
System.out.println(embedding.tokenCount()); // 11

// Similarity comparison
EmbeddingTokenCount embedding1 = model.infer("The big brown fox jumped over the lazy dog");
EmbeddingTokenCount embedding2 = model.infer("The brown fox jumped over the dog");

System.out.println(
    onnxConfig.gioMaths().cosineScore(embedding1.embedding(), embedding2.embedding())
);
```

> Try this with [`Xenova/all-MiniLM-L6-v2`](https://huggingface.co/Xenova/all-MiniLM-L6-v2).
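Two ideas appear in the example above: mean pooling, which collapses per-token vectors into one sentence vector, and cosine similarity, which compares two sentence vectors. The sketch below implements both in plain Java to show the arithmetic. `EmbeddingMathSketch`, `meanPool`, and `cosine` are hypothetical names, the use of `float[]` is an assumption, and the code is not the library's `GioMaths` implementation.

```java
/** Hypothetical helper for illustration; not part of gravitee-inference. */
final class EmbeddingMathSketch {

    /** Averages token vectors, skipping padding positions (mask value 0). */
    static float[] meanPool(float[][] tokenVectors, int[] attentionMask) {
        int dimensions = tokenVectors[0].length;
        float[] pooled = new float[dimensions];
        int counted = 0;
        for (int t = 0; t < tokenVectors.length; t++) {
            if (attentionMask[t] == 0) {
                continue; // padding token, excluded from the average
            }
            for (int d = 0; d < dimensions; d++) {
                pooled[d] += tokenVectors[t][d];
            }
            counted++;
        }
        for (int d = 0; d < dimensions; d++) {
            pooled[d] /= Math.max(counted, 1); // avoid dividing by zero
        }
        return pooled;
    }

    /** Cosine similarity: dot(a, b) / (|a| * |b|). */
    static double cosine(float[] a, float[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up 2-dimensional token vectors; the third token is padding.
        float[][] tokens = {{1f, 2f}, {3f, 4f}, {9f, 9f}};
        float[] sentence = meanPool(tokens, new int[] {1, 1, 0});
        System.out.println(cosine(sentence, new float[] {2f, 3f}));
    }
}
```

A cosine score near 1 means the two vectors point in almost the same direction, which is why it is a common choice for sentence similarity.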

---

### ⚡ SIMD Capabilities

To run with SIMD math acceleration:

1. Add the following to your JVM arguments:

```sh
--add-modules jdk.incubator.vector
```

2. Add the corresponding dependency:

```xml
<dependency>
    <groupId>io.gravitee.inference.math.simd</groupId>
    <artifactId>gravitee-inference-math-simd</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>
```

```java
import io.gravitee.inference.math.simd.factory.SIMDMathFactory;

// Presumably passed to OnnxBertConfig where the examples above use NativeMath.INSTANCE
GioMaths maths = SIMDMathFactory.gioMaths();
```

The factory determines at runtime which SIMD capabilities your CPU supports.
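If SIMD acceleration does not seem to take effect, it can help to confirm that the `--add-modules jdk.incubator.vector` flag actually reached the JVM. The sketch below checks whether the module is resolved in the boot layer using only standard JDK APIs. `VectorModuleCheck` is a hypothetical name, and this check is independent of whatever detection `SIMDMathFactory` performs internally.

```java
/** Hypothetical diagnostic for illustration; not part of gravitee-inference. */
final class VectorModuleCheck {

    /** True when the incubating Vector API module is resolved in the boot layer. */
    static boolean vectorApiAvailable() {
        return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
    }

    public static void main(String[] args) {
        // Incubator modules are not resolved by default, so this is expected
        // to report false unless the --add-modules flag was supplied.
        System.out.println("jdk.incubator.vector resolved: " + vectorApiAvailable());
    }
}
```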
