https://github.com/machinefi/trio-core

Real-time vision intelligence engine for Apple Silicon. YOLO counting + VLM scene understanding + auto-calibration. REST API in one command.
https://github.com/machinefi/trio-core

apple-silicon computer-vision edge-ai inference mlx object-detection people-counting real-time vlm yolo

Last synced: 3 months ago
JSON representation

Real-time vision intelligence engine for Apple Silicon. YOLO counting + VLM scene understanding + auto-calibration. REST API in one command.

Host: GitHub
URL: https://github.com/machinefi/trio-core
Owner: machinefi
License: apache-2.0
Created: 2026-03-05T23:12:28.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-04-03T05:07:10.000Z (3 months ago)
Last Synced: 2026-04-03T12:58:23.701Z (3 months ago)
Topics: apple-silicon, computer-vision, edge-ai, inference, mlx, object-detection, people-counting, real-time, vlm, yolo
Language: Python
Homepage: https://pypi.org/project/trio-core/
Size: 65.2 MB
Stars: 11
Watchers: 1
Forks: 3
Open Issues: 9
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md

Awesome Lists containing this project

README

          


  
Trio Edge

  

    Open-source camera intelligence for your network

  

  

    ONVIF discovery + RTSP streaming + YOLO detection + VLM scene understanding.


    One pip install. Runs on any Mac, Linux, or Windows machine.

  




  

  

  

  





  Quick Start |

  Two Modes |

  Cloud Relay |

  Local AI |

  CLI |

  API |

  SDK |

  Architecture



---

## What is Trio Edge?

Trio Edge is the open-source camera agent for [Trio AI](https://trio.ai). It runs on your local network, discovers cameras via ONVIF, and either:

1. **Relays frames to Trio Cloud** for full AI analysis (memory, entity tracking, dashboards)

2. **Runs AI locally** with your own LLM (Claude, GPT, local Qwen) for standalone use

```

┌──────────────────────────────────────────────────────────┐

│  Your Network                                            │

│                                                          │

│  IP Camera ──RTSP──► Trio Edge ──HTTPS──► Trio Cloud     │

│                      (this repo)         (paid, $99/cam) │

│                          │                               │

│                          └── or use your own LLM         │

│                             (free, open source)          │

└──────────────────────────────────────────────────────────┘

```

**Core capabilities:**

- **Discover** — Auto-find cameras on your network via ONVIF

- **Relay** — Push RTSP frames to Trio Cloud over HTTPS (NAT-friendly)

- **Detect** — YOLO v10n object detection (people, vehicles, 80 classes)

- **Describe** — VLM scene descriptions with any LLM (local or cloud)

- **Tailscale auto-proxy** — Works through Tailscale networks automatically

---

## Quick Start

```bash

# Install

pip install 'trio-edge[mlx]'      # Apple Silicon

pip install 'trio-edge[cuda]'     # NVIDIA GPU

pip install trio-edge              # CPU-only

# Discover cameras on your network

trio discover

# Start watching a camera

trio cam --rtsp rtsp://admin:pass@192.168.1.100/stream

# Or relay to Trio Cloud

trio relay --camera rtsp://admin:pass@192.168.1.100/stream \

           --cloud https://api.trio.ai --token YOUR_TOKEN

```

---

## Two Modes

### Mode 1: Cloud Relay (Trio Cloud customers)

Trio Edge pulls RTSP or local video, registers a camera with Trio Cloud, and pushes HTTP MPEG-TS to the cloud ingest endpoint. All AI processing happens in the cloud.

```bash

trio relay --camera rtsp://admin:pass@192.168.1.100/stream \

           --cloud https://api.trio.ai \

           --token YOUR_TOKEN

```

**What Edge does:** source capture → ffmpeg MPEG-TS mux → authenticated HTTP upload

**What Cloud does:** session management → ingest → analysis → dashboard

### Mode 2: Local AI (open-source users)

Run everything locally with your own LLM. No cloud needed, no subscription.

```bash

# With local Qwen model (Apple Silicon)

trio cam --rtsp rtsp://admin:pass@192.168.1.100/stream

# With Claude API

ANTHROPIC_API_KEY=sk-xxx trio cam --rtsp rtsp://... --llm claude

# With OpenAI

OPENAI_API_KEY=sk-xxx trio cam --rtsp rtsp://... --llm gpt-4o

```

**Note:** Local mode gives you real-time detection and descriptions, but no persistent memory, entity tracking, historical analytics, or dashboard. For those features, use [Trio Cloud](https://trio.ai).

---

## Features

### ONVIF Camera Discovery

```bash

trio discover

# Found 2 camera(s):

#   [1] Reolink RLC-810A (192.168.1.100)

#       RTSP: rtsp://192.168.1.100:554/h264Preview_01_main

#   [2] Hikvision DS-2CD2143 (192.168.1.101)

#       RTSP: rtsp://192.168.1.101:554/Streaming/Channels/101

```

### Tailscale Auto-Proxy

If you use Tailscale, Trio Edge automatically detects when the macOS network extension blocks camera access and creates a transparent TCP proxy:

```

trio cam --rtsp rtsp://admin:pass@192.168.1.100/stream

# Tailscale detected — starting TCP proxy via system Python...

# Proxy: 127.0.0.1:15554 → 192.168.1.100:554

# (continues normally, user sees no difference)

```

### YOLO Object Detection

Built-in YOLOv10n (ONNX, 9MB) with tiled detection for accuracy:

```bash

trio cam --rtsp rtsp://... --count

# [14:23:46] People: 3, Vehicles: 2

# [14:24:12] People: 5, Vehicles: 2 (+2 people)

```

### VLM Scene Description

Supports multiple VLM backends:

| Backend | Command | Requirements |

|---------|---------|-------------|

| Local Qwen (MLX) | `trio cam --rtsp ...` | Apple Silicon, 4GB+ RAM |

| Claude | `trio cam --llm claude` | `ANTHROPIC_API_KEY` |

| GPT-4o | `trio cam --llm gpt-4o` | `OPENAI_API_KEY` |

| Any OpenAI-compatible | `trio cam --llm-url http://...` | API endpoint |

---

## CLI

```bash

trio discover                          # Find cameras via ONVIF

trio cam --rtsp rtsp://... --count     # Watch + count objects

trio cam --host 192.168.1.100 -p pass  # Auto-discover + connect

trio relay --camera rtsp://... --cloud https://api.trio.ai  # Cloud relay

trio serve                             # Start inference API server

trio analyze photo.jpg -q "What's here?"  # Analyze single image

trio webcam -w "person at the door"    # Webcam with alerts

trio doctor                            # Diagnose setup issues

trio device                            # Show hardware info

```

---

## API Reference

Start the local inference server:

```bash

trio serve                          # default: 0.0.0.0:8100

trio serve --port 9000              # custom port

TRIO_API_KEY=secret trio serve      # enable auth

```

### `POST /api/inference/detect`

YOLO object detection.

```bash

curl -X POST http://localhost:8100/api/inference/detect \

  -H "Content-Type: application/json" \

  -d '{"image_b64": "'$(base64 -i photo.jpg)'"}'

```

```json

{

  "people_count": 3, "vehicle_count": 1,

  "by_class": {"person": 3, "car": 1},

  "crops_b64": [{"class": "person", "bbox": [100, 50, 200, 300], "confidence": 0.92}],

  "elapsed_ms": 45

}

```

### `POST /api/inference/describe`

VLM scene description.

```bash

curl -X POST http://localhost:8100/api/inference/describe \

  -H "Content-Type: application/json" \

  -d '{"image_b64": "'$(base64 -i photo.jpg)'", "prompt": "Describe what you see."}'

```

### `POST /api/inference/crop-describe`

Combined: YOLO detects → crop → VLM describes each entity → full scene description.

---

## Python SDK

```python

from trio_edge import TrioCore, EngineConfig

engine = TrioCore()

engine.load()

result = engine.analyze_video("photo.jpg", "What do you see?")

print(result.text)

```

---

## Supported Models

### Tier 1 — Full optimization (native loading + visual token compression + KV reuse)

| Model | Params | 4-bit VRAM | Best for |

|---|---|---|---|

| Qwen3-VL-8B | 8B | ~5GB | **Recommended** — best accuracy |

| Qwen2.5-VL-3B | 3B | ~2GB | Fast, lightweight |

| Qwen3.5 | 0.8-9B | 0.5-5G | Flexible range |

| InternVL3 | 1-2B | 1-1.6G | Tiny devices |

### Tier 2 — Inference only (via mlx-vlm)

Gemma 3n, SmolVLM2, Phi-4, FastVLM, and any model supported by mlx-vlm.

---

## Architecture

```

                          Trio Edge

                              |

              +---------------+---------------+

              |                               |

         YOLO Pipeline                   VLM Pipeline

              |                               |

    YOLOv10n ONNX (9MB)            Qwen/Claude/GPT/any LLM

    tiled 2x2 detection              native MLX loading

    ByteTrack tracking               ToMe token compression

              |                       KV cache reuse

              |                               |

              +---------------+---------------+

                              |

              +-------+-------+-------+

              |       |       |       |

          /detect  /describe  /crop   Relay

                              -describe  to Cloud

```

### Trio Cloud integration

When connected to Trio Cloud, Edge is just a lightweight relay:

```

Camera → Edge (RTSP pull + compress) → Trio Cloud (all AI in cloud)

```

Edge sends ~50-100 KB/s per camera. No GPU needed on the edge device.

---

## Configuration

| Variable | Default | Description |

|---|---|---|

| `TRIO_MODEL` | `Qwen3-VL-8B-4bit` | HuggingFace model ID |

| `TRIO_YOLO_MODEL` | (auto-downloaded) | Path to YOLO ONNX model |

| `TRIO_API_KEY` | (none) | Bearer token for API auth |

| `TRIO_CLOUD_URL` | (none) | Trio Cloud API URL for relay mode |

| `TRIO_CLOUD_TOKEN` | (none) | Trio Cloud auth token |

See [`src/trio_edge/config.py`](src/trio_edge/config.py) for all options.

---

## Troubleshooting

| Problem | Solution |

|---|---|

| `trio discover` finds no cameras | Make sure cameras are on the same subnet. Some routers block multicast. |

| Camera found but can't connect | Check username/password. Try `trio cam --rtsp rtsp://admin:pass@IP/stream` directly. |

| Tailscale blocking camera access | Trio Edge auto-detects this and creates a proxy. If it doesn't work, try `trio doctor`. |

| First run slow | Model download (~2-5 GB). Subsequent runs start instantly. |

| Out of memory | Use a smaller model: `TRIO_MODEL=mlx-community/Qwen2.5-VL-3B-Instruct-4bit` |

Run `trio doctor` to diagnose most issues.

---

## License

Apache 2.0 — see [LICENSE](LICENSE).

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/machinefi/trio-core

Awesome Lists containing this project

README

Trio Edge