{"id":44722698,"url":"https://github.com/inference4j/inference4j","last_synced_at":"2026-03-04T16:03:14.509Z","repository":{"id":338145370,"uuid":"1155794452","full_name":"inference4j/inference4j","owner":"inference4j","description":"Java Inference API for Onnx models","archived":false,"fork":false,"pushed_at":"2026-03-01T18:47:45.000Z","size":6316,"stargazers_count":25,"open_issues_count":3,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-03-01T19:46:51.226Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/inference4j.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-11T23:01:00.000Z","updated_at":"2026-03-01T18:47:32.000Z","dependencies_parsed_at":"2026-02-18T04:03:24.062Z","dependency_job_id":null,"html_url":"https://github.com/inference4j/inference4j","commit_stats":null,"previous_names":["inference4j/inference4j"],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/inference4j/inference4j","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference4j%2Finference4j","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference4j%2Finference4j/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference4j%2Finference4j/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference4j%2Finference4j/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/inference4j","download_url":"https://codeload.github.com/inference4j/inference4j/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference4j%2Finference4j/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30085832,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T15:40:14.053Z","status":"ssl_error","status_checked_at":"2026-03-04T15:40:13.655Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-15T16:18:57.672Z","updated_at":"2026-03-04T16:03:14.499Z","avatar_url":"https://github.com/inference4j.png","language":"Java","funding_links":[],"categories":["人工智能"],"sub_categories":["Spring Cloud框架"],"readme":"# 
inference4j\n\n[![CI](https://github.com/inference4j/inference4j/actions/workflows/ci.yml/badge.svg)](https://github.com/inference4j/inference4j/actions/workflows/ci.yml)\n[![codecov](https://codecov.io/gh/inference4j/inference4j/graph/badge.svg)](https://codecov.io/gh/inference4j/inference4j)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)\n[![Docs](https://img.shields.io/badge/docs-inference4j.github.io-7c4dff.svg)](https://inference4j.github.io/inference4j)\n\n**Run AI models in Java. Three lines of code, zero setup.**\n\ninference4j is an inference-only AI library for Java built on ONNX Runtime. It provides ergonomic, type-safe APIs for running model inference **locally** — no API keys, no network calls, no third-party services. Pass a `String`, `BufferedImage`, or `Path`, get Java objects back.\n\n\u003e **Note:** inference4j is under active development (0.x). APIs may change. Check out the [documentation site](https://inference4j.github.io/inference4j) for the full user guide, or browse the [examples](inference4j-examples/README.md) to get started.\n\n## What can you do with inference4j?\n\nWant to see it in action? 
Check out [inference4j-showcase](https://github.com/inference4j/inference4j-showcase) — a local demo app you can run to explore every capability the library provides.\n\n### Sentiment Analysis\n\n```java\ntry (var classifier = DistilBertTextClassifier.builder().build()) {\n    System.out.println(classifier.classify(\"This movie was fantastic!\"));\n    // [TextClassification[label=POSITIVE, confidence=0.9998]]\n}\n```\n\n### Text Embeddings \u0026 Semantic Search\n\n```java\ntry (var embedder = SentenceTransformerEmbedder.builder()\n        .modelId(\"inference4j/all-MiniLM-L6-v2\").build()) {\n    float[] embedding = embedder.encode(\"Hello, world!\");\n}\n```\n\n### Image Classification\n\n```java\ntry (var classifier = ResNetClassifier.builder().build()) {\n    List\u003cClassification\u003e results = classifier.classify(Path.of(\"cat.jpg\"));\n    // [Classification[label=tabby cat, confidence=0.87], ...]\n}\n```\n\n### Object Detection\n\n```java\ntry (var detector = YoloV8Detector.builder().build()) {\n    List\u003cDetection\u003e detections = detector.detect(Path.of(\"street.jpg\"));\n    // [Detection[label=car, confidence=0.94, box=BoundingBox[...]], ...]\n}\n```\n\n### Speech-to-Text\n\n```java\ntry (var recognizer = Wav2Vec2Recognizer.builder().build()) {\n    System.out.println(recognizer.transcribe(Path.of(\"audio.wav\")).text());\n}\n```\n\n### Voice Activity Detection\n\n```java\ntry (var vad = SileroVadDetector.builder().build()) {\n    List\u003cVoiceSegment\u003e segments = vad.detect(Path.of(\"meeting.wav\"));\n    // [VoiceSegment[start=0.50, end=3.20], VoiceSegment[start=5.10, end=8.75]]\n}\n```\n\n### Text Detection\n\n```java\ntry (var detector = CraftTextDetector.builder().build()) {\n    List\u003cTextRegion\u003e regions = detector.detect(Path.of(\"document.jpg\"));\n}\n```\n\n### Zero-Shot Image Classification\n\n```java\ntry (var classifier = ClipClassifier.builder().build()) {\n    List\u003cClassification\u003e results = 
classifier.classify(\n            Path.of(\"photo.jpg\"), List.of(\"cat\", \"dog\", \"bird\", \"car\"));\n    // [Classification[label=cat, confidence=0.82], ...]\n}\n```\n\n### Search Reranking\n\n```java\ntry (var reranker = MiniLMSearchReranker.builder().build()) {\n    float score = reranker.score(\"What is Java?\", \"Java is a programming language.\");\n}\n```\n\n### Text Generation\n\n```java\ntry (var gen = OnnxTextGenerator.qwen2()\n        .maxNewTokens(50).temperature(0.8f).topK(50).build()) {\n    gen.generate(\"Explain gravity\", token -\u003e System.out.print(token));\n}\n```\n\n## Getting Started\n\n**Requirements:** Java 17+\n\n### Add the dependency\n\n`inference4j-core` is the only dependency you need — it includes all model wrappers, tokenizers, and preprocessing.\n\n**Gradle**\n\n```groovy\nimplementation \"io.github.inference4j:inference4j-core:${inference4jVersion}\"\n```\n\n**Maven**\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.github.inference4j\u003c/groupId\u003e\n    \u003cartifactId\u003einference4j-core\u003c/artifactId\u003e\n    \u003cversion\u003e${inference4jVersion}\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n\u003e **JVM flag:** ONNX Runtime requires native access. Add `--enable-native-access=ALL-UNNAMED` to your JVM arguments, or use `--enable-native-access=com.microsoft.onnxruntime` if you're on the module path.\n\n### Run your first model\n\n```java\nimport io.github.inference4j.nlp.DistilBertTextClassifier;\n\npublic class QuickStart {\n    public static void main(String[] args) {\n        try (var classifier = DistilBertTextClassifier.builder().build()) {\n            System.out.println(classifier.classify(\"inference4j makes AI in Java easy!\"));\n            // [TextClassification[label=POSITIVE, confidence=0.9998]]\n        }\n    }\n}\n```\n\nThat's it. The model downloads automatically on first run (~260MB, cached in `~/.cache/inference4j/`). 
No Python, no manual downloads, no tensor wrangling.\n\n## What you don't have to do\n\n- **No tokenization** — WordPiece and BPE tokenizers are built in and handled automatically\n- **No tensor handling** — pass a `String`, `BufferedImage`, or `Path`; get Java objects back\n- **No ONNX session setup** — `builder().build()` handles everything\n- **No model downloads** — auto-downloaded from HuggingFace and cached on first use\n- **No Python sidecar** — pure Java, runs anywhere Java runs\n\n## vs raw ONNX Runtime\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003cth\u003eWithout inference4j\u003c/th\u003e\n\u003cth\u003eWith inference4j\u003c/th\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\n\n```java\nOrtEnvironment env = OrtEnvironment.getEnvironment();\nOrtSession session = env.createSession(\"resnet50.onnx\");\n\nBufferedImage img = ImageIO.read(new File(\"cat.jpg\"));\nBufferedImage resized = resize(img, 224, 224);\nfloat[] pixels = new float[3 * 224 * 224];\nfor (int c = 0; c \u003c 3; c++)\n  for (int y = 0; y \u003c 224; y++)\n    for (int x = 0; x \u003c 224; x++) {\n      int rgb = resized.getRGB(x, y);\n      float val = ((rgb \u003e\u003e (16 - c * 8)) \u0026 0xFF) / 255f;\n      pixels[c * 224 * 224 + y * 224 + x] =\n        (val - MEAN[c]) / STD[c];\n    }\n\nOnnxTensor tensor = OnnxTensor.createTensor(env,\n    FloatBuffer.wrap(pixels), new long[]{1, 3, 224, 224});\nOrtSession.Result result = session.run(\n    Map.of(\"data\", tensor));\nfloat[] logits = ((float[][]) result.get(0)\n    .getValue())[0];\nfloat[] probs = softmax(logits);\nint bestIdx = argmax(probs);\nString label = LABELS[bestIdx];\n// ~30 lines, manual everything\n```\n\n\u003c/td\u003e\n\u003ctd\u003e\n\n```java\ntry (var classifier = ResNetClassifier.builder().build()) {\n    var results = classifier.classify(\n        Path.of(\"cat.jpg\")\n    );\n    // done.\n}\n\n// 3 lines.\n// Auto-downloads model.\n// Handles preprocessing.\n// Returns Java 
objects.\n```\n\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n## Why inference4j?\n\nJava has great tools for building AI-powered applications. [Spring AI](https://spring.io/projects/spring-ai) provides an excellent abstraction layer for LLM orchestration. [DJL](https://djl.ai/) offers engine-agnostic model training and inference. [LangChain4j](https://docs.langchain4j.dev/) simplifies LLM-powered workflows.\n\n**inference4j doesn't compete with any of them.** It fills a different gap.\n\nWhen you need to run a specific ONNX model — an embedding model, an object detector, a speech-to-text model — you currently face a choice: drop down to the raw ONNX Runtime Java bindings and deal with `Map\u003cString, OnnxTensor\u003e` manually, or pull in a heavyweight framework that does far more than you need.\n\ninference4j sits in the sweet spot:\n\n- **3-line integration** for popular models — `builder().build()`, call a method, get Java objects back\n- **Standard Java types** in, standard Java types out — no tensor abstractions leak into your code\n- **Inference only** — optimized for production serving, not training\n- **Lightweight** — each wrapper is a thin layer over ONNX Runtime, not a framework\n- **Complements the ecosystem** — use inference4j to run your embedding model, Spring AI to orchestrate your LLM chain, both in the same application\n\nWe believe the Java AI ecosystem is stronger when tools do one thing well. 
inference4j does local model inference, and tries to do it really well.\n\n## Supported Models\n\n### NLP\n\n| Capability | Models | API |\n|---|---|---|\n| Classification | DistilBERT, BERT | `TextClassifier` |\n| Embeddings | all-MiniLM, all-mpnet | `TextEmbedder` |\n| Reranking | ms-marco-MiniLM | `SearchReranker` |\n| Text Generation | GPT-2, SmolLM2, Qwen2.5 | `TextGenerator` |\n\n### Vision\n\n| Capability | Models | API |\n|---|---|---|\n| Classification | ResNet, EfficientNet | `ImageClassifier` |\n| Object Detection | YOLOv8, YOLO11, YOLO26 | `ObjectDetector` |\n| Text Detection | CRAFT | `TextDetector` |\n\n### Multimodal\n\n| Capability | Models | API |\n|---|---|---|\n| Zero-Shot Classification | CLIP | `ZeroShotClassifier` |\n| Image Embeddings | CLIP | `ImageEmbedder` |\n| Text Embeddings | CLIP | `TextEmbedder` |\n\n### Audio\n\n| Capability | Models | API |\n|---|---|---|\n| Recognition | Wav2Vec2 | `SpeechRecognizer` |\n| Voice Activity Detection | Silero VAD | `VoiceActivityDetector` |\n\n### Generative AI (onnxruntime-genai)\n\n| Capability | Models | API |\n|---|---|---|\n| Text Generation | Phi-3, DeepSeek-R1 | `TextGenerator` |\n| Vision-Language | Phi-3.5 Vision | `VisionLanguageModel` |\n| Speech-to-Text | Whisper | `WhisperSpeechModel` |\n\n\u003e **Auto-download:** All supported models are hosted under the [`inference4j`](https://huggingface.co/inference4j) HuggingFace organization. Models are automatically downloaded and cached on first use — no manual setup required. Cache location defaults to `~/.cache/inference4j/` and can be customized via `INFERENCE4J_CACHE_DIR` or `-Dinference4j.cache.dir`.\n\n## Hardware Acceleration\n\ninference4j supports GPU and hardware acceleration out of the box via ONNX Runtime execution providers. 
On macOS, CoreML is bundled in the standard dependency — just add one line:\n\n```java\ntry (var classifier = ResNetClassifier.builder()\n        .sessionOptions(opts -\u003e opts.addCoreML())\n        .build()) {\n    classifier.classify(Path.of(\"cat.jpg\"));\n}\n```\n\nFor CUDA (Linux/Windows), swap the Maven dependency from `onnxruntime` to `onnxruntime_gpu`:\n\n```java\ntry (var classifier = ResNetClassifier.builder()\n        .sessionOptions(opts -\u003e opts.addCUDA(0))\n        .build()) {\n    classifier.classify(Path.of(\"cat.jpg\"));\n}\n```\n\nThe `.sessionOptions()` API is available on every model wrapper.\n\n### Benchmarks on Apple Silicon (M-series)\n\n| Model | Capability | CPU | CoreML | Speedup |\n|-------|------|-----|--------|---------|\n| ResNet-50 | Image Classification | 37 ms | 10 ms | **3.7x** |\n| CRAFT | Text Detection | 831 ms | 153 ms | **5.4x** |\n\n\u003e Measured with 3 warmup runs + 10 timed runs. See the benchmark examples for [ResNet](inference4j-examples/src/main/java/io/github/inference4j/examples/ResNetAccelerationBenchmarkExample.java) and [CRAFT](inference4j-examples/src/main/java/io/github/inference4j/examples/CraftAccelerationBenchmarkExample.java).\n\n## Spring Boot\n\nAdd the starter and enable the models you need:\n\n```groovy\nimplementation \"io.github.inference4j:inference4j-spring-boot-starter:${inference4jVersion}\"\n```\n\n```yaml\ninference4j:\n  nlp:\n    text-classifier:\n      enabled: true\n```\n\n```java\n@RestController\npublic class SentimentController {\n    private final TextClassifier classifier;\n\n    public SentimentController(TextClassifier classifier) {\n        this.classifier = classifier;\n    }\n\n    @PostMapping(\"/analyze\")\n    public List\u003cTextClassification\u003e analyze(@RequestBody String text) {\n        return classifier.classify(text);\n    }\n}\n```\n\nEvery model is opt-in — nothing is downloaded until you set `enabled: true`. 
Beans are interface-typed, so you can swap implementations with `@ConditionalOnMissingBean`. An actuator health indicator is included out of the box. See the [full documentation](https://github.com/inference4j/inference4j/wiki) for all available properties.\n\n## Roadmap\n\nSee the [Roadmap](https://inference4j.github.io/inference4j/roadmap/) for details and what's coming next.\n\n## Project Structure\n\n| Module | Description |\n|--------|-------------|\n| `inference4j-core` | Model wrappers, tokenizers, preprocessing, ONNX Runtime abstractions, native generation engine |\n| `inference4j-genai` | Generative AI via [onnxruntime-genai](https://github.com/microsoft/onnxruntime-genai) — Phi-3, DeepSeek-R1, Whisper, Phi-3.5 Vision |\n| `inference4j-runtime` | Operational layer — model routing, A/B testing, Micrometer metrics |\n| `inference4j-spring-boot-starter` | Spring Boot auto-configuration, health indicators |\n| `inference4j-examples` | Runnable examples ([see README](inference4j-examples/README.md)) |\n\n## Build\n\nRequires **Java 17**.\n\n```bash\n./gradlew build          # Build all modules and run tests\n./gradlew test           # Run tests only\n```\n\n## Built with Claude Code\n\nThis project was built collaboratively with [Claude Code](https://claude.ai/code).\n\nThe humans drove architecture, API design, interface contracts, and model selection. Claude Code wrote the implementation — the wrappers, the preprocessing pipelines, the math utilities, the tests, and the documentation you're reading right now.\n\nWithout Claude Code, this project would have taken weeks instead of hours. We want to embrace this new reality of software development, keep pushing forward, and — with community feedback and contributions — give something useful back to the Java ecosystem.\n\nWe think this is worth being transparent about. Agentic development is how software gets built now, and pretending otherwise helps no one. 
The design decisions are ours; the execution was a partnership.\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.\n\n## License\n\n[Apache License 2.0](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finference4j%2Finference4j","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finference4j%2Finference4j","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finference4j%2Finference4j/lists"}