https://github.com/anoni-net/onionoo-fastapi
Semantic/OpenAPI proxy for the Tor Metrics Onionoo API, built with FastAPI for easier integration and automated analysis.
agentic-ai ai-agents data-analysis fastapi network-metrics observability onionoo openapi privacy pydantic python semantic-apis tor tor-metrics
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/anoni-net/onionoo-fastapi
- Owner: anoni-net
- License: mit
- Created: 2026-01-31T19:28:00.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-05-15T05:20:33.000Z (about 1 month ago)
- Last Synced: 2026-05-15T07:33:23.279Z (about 1 month ago)
- Topics: agentic-ai, ai-agents, data-analysis, fastapi, network-metrics, observability, onionoo, openapi, privacy, pydantic, python, semantic-apis, tor, tor-metrics
- Language: Python
- Homepage: https://onionoo.anoni.net/docs
- Size: 285 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# onionoo-fastapi
FastAPI-based **semantic/OpenAPI proxy** for the Tor **Onionoo** API.
- GitHub: https://github.com/anoni-net/onionoo-fastapi
- Upstream data source: https://onionoo.torproject.org
- This service **does not store Onionoo data**; it only forwards requests and transforms responses.
- Primary motivation: Onionoo has a solid spec, but **no OpenAPI**; this service provides a friendly schema **for tooling/AI agents**.
Reference spec: [Tor Metrics – Onionoo](https://metrics.torproject.org/onionoo.html)
## Hosted instance
- Service: `https://onionoo.anoni.net`
- Swagger UI: `https://onionoo.anoni.net/docs`
## Releases
Tagged releases (`vX.Y.Z`) trigger two GitHub Actions workflows:
- `.github/workflows/release.yml` — builds the wheel/sdist and publishes to
PyPI via Trusted Publishing. Register this workflow as a trusted publisher
on the `onionoo-fastapi` PyPI project once before the first tag.
- `.github/workflows/docker.yml` — builds a multi-arch image and pushes to
`ghcr.io//onionoo-fastapi`. Uses the default `GITHUB_TOKEN`.
Cut a release with:
```bash
git tag -a v0.2.0 -m "Release 0.2.0"
git push origin v0.2.0
```
## License
MIT. See `LICENSE`.
## Requirements
- Python 3.11+
- [`uv`](https://docs.astral.sh/uv/)
## Install
```bash
git clone https://github.com/anoni-net/onionoo-fastapi
cd onionoo-fastapi
uv sync
```
Or if you already have the source:
```bash
cd onionoo-fastapi
uv sync
```
## Run
```bash
fastapi run app/main.py --reload --host 0.0.0.0 --port 8000
```
**Note:** `fastapi run` requires FastAPI 0.111.0 or newer, the release that introduced the `fastapi` CLI.
OpenAPI docs:
- Swagger UI: `http://localhost:8000/docs`
- OpenAPI JSON: `http://localhost:8000/openapi.json`
## Test
```bash
uv sync --extra dev
uv run pytest
```
## Docker
Build and run with Docker Compose:
```bash
docker compose up -d --build
```
If port 8000 is already in use, override the host port (example: 8001):
```bash
HOST_PORT=8001 docker compose up -d --build
```
Stop:
```bash
docker compose down
```
Configuration via environment variables (example):
```bash
ONIONOO_BASE_URL=https://onionoo.torproject.org HOST_PORT=8001 docker compose up -d --build
```
## API
This service exposes semantic endpoints under `/v1/*`:
- `GET /v1/summary`
- `GET /v1/details`
- `GET /v1/bandwidth`
- `GET /v1/weights`
- `GET /v1/clients`
- `GET /v1/uptime`
Aggregate (server-side group-by, sorted by relay count):
- `GET /v1/aggregate/countries` — buckets by two-letter country code
- `GET /v1/aggregate/as` — buckets by autonomous system number
- `GET /v1/aggregate/flags` — buckets by directory-authority flag (a relay can fall into multiple flag buckets)
Plus:
- `GET /healthz` — static liveness
- `GET /healthz/ready` — verifies upstream reachability (cached)
- `GET /metrics` — Prometheus format
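To make the aggregate semantics concrete, here is a minimal Python sketch of the same group-by done client-side. It is not the service's implementation: the helper name `bucket_counts` is hypothetical, the `country` and `flags` keys are assumed to follow Onionoo details documents, and the sample relays are made up.

```python
from collections import Counter


def bucket_counts(relays, key):
    """Count relays per bucket, most populated bucket first."""
    counts = Counter()
    for relay in relays:
        value = relay.get(key)
        if value is None:
            continue
        # A list-valued field such as `flags` puts one relay in several buckets.
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts.most_common()


# Made-up sample data, shaped like Onionoo details documents.
relays = [
    {"country": "de", "flags": ["Running", "Guard"]},
    {"country": "de", "flags": ["Running"]},
    {"country": "us", "flags": ["Running", "Exit"]},
]
by_country = bucket_counts(relays, "country")
by_flag = bucket_counts(relays, "flags")
```

Because a relay can carry several flags, the flag bucket counts add up to more than the number of relays, while the country counts add up to exactly the number of relays that have a country.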
### Example requests
```bash
# Summary (semantic keys; upstream short keys are transformed)
curl -s 'http://localhost:8000/v1/summary?limit=1' | jq .
# Details (supports Onionoo query parameters + details-only `fields`)
curl -s 'http://localhost:8000/v1/details?limit=1&search=moria&fields=nickname,fingerprint' | jq .
# Bandwidth
curl -s 'http://localhost:8000/v1/bandwidth?limit=1&search=moria' | jq .
# Weights (relays only)
curl -s 'http://localhost:8000/v1/weights?limit=1&search=moria' | jq .
# Clients (bridges only)
curl -s 'http://localhost:8000/v1/clients?limit=1' | jq .
# Uptime
curl -s 'http://localhost:8000/v1/uptime?limit=1&search=moria' | jq .
```
### Semantic field mapping notes
- `/v1/summary` transforms Onionoo short keys:
  - relay: `n,f,a,r` → `nickname,fingerprint,addresses,running`
  - bridge: `n,h,r` → `nickname,hashed_fingerprint,running`
- For some bridge documents (`/bandwidth`, `/clients`, `/uptime`), Onionoo uses the key name `fingerprint` even though the value is a **hashed fingerprint**; this API exposes that as `hashed_fingerprint`.
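The mapping notes above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: the function names are hypothetical, unknown keys are assumed to pass through unchanged, and the sample entry is made up.

```python
# Short-key tables taken from the mapping notes above.
RELAY_KEYS = {"n": "nickname", "f": "fingerprint", "a": "addresses", "r": "running"}
BRIDGE_KEYS = {"n": "nickname", "h": "hashed_fingerprint", "r": "running"}


def expand_summary_entry(entry, key_map):
    """Rename known short keys; pass unknown keys through unchanged (assumption)."""
    return {key_map.get(key, key): value for key, value in entry.items()}


def expand_bridge_history_entry(entry):
    """Expose a bridge document's `fingerprint` value as `hashed_fingerprint`."""
    renamed = dict(entry)
    if "fingerprint" in renamed:
        renamed["hashed_fingerprint"] = renamed.pop("fingerprint")
    return renamed


# Made-up relay summary entry.
relay = expand_summary_entry(
    {"n": "example", "f": "A" * 40, "a": ["192.0.2.1"], "r": True}, RELAY_KEYS
)
```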
### Caching / 304 behavior
If the client includes `If-Modified-Since`, it will be forwarded upstream. If Onionoo replies with `304`, this service will reply `304` too.
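From the client side, a conditional request might look like the following sketch. It uses only the Python standard library; the function names are hypothetical, and the 30-second timeout is an arbitrary choice for the example.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, last_modified=None):
    """Build a GET request, adding If-Modified-Since when a timestamp is known."""
    request = urllib.request.Request(url)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def fetch_if_modified(url, last_modified=None, timeout=30):
    """Return the response body, or None when the server answers 304."""
    request = build_conditional_request(url, last_modified)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return None  # The copy the client already holds is still current.
        raise
```

Against a local instance, `fetch_if_modified("http://localhost:8000/v1/summary?limit=1", "Fri, 15 May 2026 12:00:00 GMT")` would return `None` whenever the proxy relays a `304` from Onionoo.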
### Configuration
Upstream / cache:
- `ONIONOO_BASE_URL` (default: `https://onionoo.torproject.org`)
- `ONIONOO_TIMEOUT_SECONDS` (default: `30`)
- `DEFAULT_LIMIT` (default: `100`)
- `MAX_LIMIT` (default: `200`)
- `USER_AGENT`
- `CACHE_MAXSIZE` (default: `1024`)
- `CACHE_DEFAULT_TTL_SECONDS` (default: `300`)
- `UPSTREAM_RETRY_ATTEMPTS` (default: `2`)
Observability / production hardening:
- `LOG_LEVEL` (default: `INFO`)
- `LOG_FORMAT` (`json` or `console`, default `json`)
- `METRICS_ENABLED` (default: `true`) — exposes `/metrics` in Prometheus format
- `CORS_ALLOWED_ORIGINS` (default: empty, CORS disabled). Example: `["https://example.com"]`
- `RATE_LIMIT_ENABLED` (default: `false`)
- `RATE_LIMIT_PER_MINUTE` (default: `120`)
- `HEALTHZ_READY_CACHE_SECONDS` (default: `30`)
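As a rough illustration of how a few of these variables and their documented defaults could be read, here is a standard-library sketch. It is an assumption, not the project's settings code: `load_settings` is a hypothetical helper, only the literal string `true` is treated as true, and `CORS_ALLOWED_ORIGINS` is parsed as JSON because the documented example value is a JSON list.

```python
import json
import os


def load_settings(env=None):
    """Read some documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "onionoo_base_url": env.get("ONIONOO_BASE_URL", "https://onionoo.torproject.org"),
        "default_limit": int(env.get("DEFAULT_LIMIT", "100")),
        "max_limit": int(env.get("MAX_LIMIT", "200")),
        "rate_limit_enabled": env.get("RATE_LIMIT_ENABLED", "false").lower() == "true",
        # An unset or empty value means no origins are allowed (CORS disabled).
        "cors_allowed_origins": json.loads(env.get("CORS_ALLOWED_ORIGINS") or "[]"),
    }
```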
### Resource sizing
A single-worker container (the default `uvicorn` CMD in the Dockerfile) measured on Alpine 3.23 / Python 3.14 / aarch64 against the real Onionoo upstream:
| Phase | RSS | Notes |
|---|---:|---|
| **Idle** (just after start) | ~75 MiB | Python + FastAPI + Pydantic + httpx + fastapi-mcp + structlog + Prometheus instrumentator loaded; cache empty. |
| **Typical agent traffic** | ~90 MiB | After ~15 mixed `/v1/*` calls (details + aggregates), only a handful of distinct upstream payloads cached. |
| **Cache near saturation** | ~180 MiB | After 200 distinct `/v1/details` queries with `fields=` projection; cache holds ~200 entries. |
From these measurements, each cached entry costs **~0.5 MiB on average** when callers use the `fields=` projection. With the default `CACHE_MAXSIZE=1024` that yields a **~500 MiB upper bound** under realistic agent traffic.
If you expect callers to hit `/v1/details` **without** `fields=`, a single response can be several MiB (Onionoo returns ~10k full relay objects). A fully saturated cache of unfiltered details would then sit in the **1–5 GiB** range — bound it by tuning `CACHE_MAXSIZE` down.
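The arithmetic behind these estimates can be reproduced with a short Python sketch. The constants come from the measurement table above; the function names are hypothetical, and the 3 MiB figure for an unfiltered response is only an example value inside the "several MiB" range.

```python
IDLE_RSS_MIB = 75         # measured RSS just after start
SATURATED_RSS_MIB = 180   # measured RSS with about 200 cached entries
ENTRIES_MEASURED = 200


def per_entry_mib():
    """Average cost of one cached entry when callers use `fields=`."""
    return (SATURATED_RSS_MIB - IDLE_RSS_MIB) / ENTRIES_MEASURED


def cache_mib(cache_maxsize, entry_mib=None):
    """Estimated memory held by a full cache, excluding the idle baseline."""
    entry = per_entry_mib() if entry_mib is None else entry_mib
    return cache_maxsize * entry


projected = cache_mib(1024)                # default CACHE_MAXSIZE, projected responses
unfiltered = cache_mib(1024, entry_mib=3)  # example: 3 MiB per unfiltered response
```

With these inputs the per-entry cost works out to roughly 0.5 MiB and the full projected cache to a little over 500 MiB, while the unfiltered example lands at 3072 MiB (3 GiB), inside the 1–5 GiB range quoted above.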
Suggested memory limits for `docker run --memory` / Kubernetes requests:
| Deployment shape | Memory request | Memory limit |
|---|---:|---:|
| Personal / single-agent test | 128 MiB | 256 MiB |
| Hosted instance, mostly cached requests | 256 MiB | 512 MiB |
| Public instance, agents may issue unfiltered `/details` | 512 MiB | 1–2 GiB |
CPU usage is light: a single worker handles tens of QPS comfortably; scale with replicas if you need more throughput. (`uvicorn ... --workers N` is also an option, but each worker keeps its own in-memory cache; horizontal scaling via separate containers is usually a better fit.)
### Health checks
- `GET /healthz` — static liveness probe, never hits upstream.
- `GET /healthz/ready` — pings Onionoo (`summary?limit=1`); 200 when reachable, 503 otherwise. Result is cached for `HEALTHZ_READY_CACHE_SECONDS`.
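The cached readiness behaviour can be pictured with the following Python sketch. It is a simplified model, not the service's code: `CachedReadiness` is a hypothetical class, the probe is injected as a callable, and any exception from the probe is assumed to mean "not ready".

```python
import time


class CachedReadiness:
    """Cache the outcome of an upstream probe for a fixed number of seconds."""

    def __init__(self, probe, ttl_seconds=30, clock=time.monotonic):
        self._probe = probe      # callable returning True when upstream is reachable
        self._ttl = ttl_seconds  # plays the role of HEALTHZ_READY_CACHE_SECONDS
        self._clock = clock
        self._checked_at = None
        self._ready = False

    def status_code(self):
        now = self._clock()
        if self._checked_at is None or now - self._checked_at >= self._ttl:
            try:
                self._ready = bool(self._probe())
            except Exception:
                self._ready = False  # any probe failure counts as "not ready"
            self._checked_at = now
        return 200 if self._ready else 503
```

One consequence of this design is that an orchestrator polling the readiness endpoint every few seconds triggers at most one upstream request per cache window.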
### Request tracing
Every request is assigned an `X-Request-ID`. Clients may supply one to correlate across systems; the same value is echoed back on the response and bound into every log record produced during the request.
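The "reuse or generate" rule can be sketched as below. This is a hypothetical helper, not the service's middleware; the format of generated IDs (a UUID4 hex string here) and the trimming of whitespace are assumptions.

```python
import uuid


def resolve_request_id(headers):
    """Reuse the caller's X-Request-ID if present, otherwise mint a new one."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-request-id" and value.strip():
            return value.strip()
    return uuid.uuid4().hex
```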
### Metrics
`/metrics` exposes Prometheus-format counters / histograms, including:
- `onionoo_cache_hits_total`, `onionoo_cache_misses_total`
- `onionoo_upstream_seconds{method=...}` (histogram)
- `onionoo_upstream_errors_total{method=..., status=...}`
- Standard `http_request_duration_seconds` from the FastAPI instrumentator
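As one way to use these counters, the sketch below derives a cache hit ratio from the text returned by `/metrics`. The helper names are hypothetical, the sample text is made up, and the parser is deliberately simple: it assumes each sample line ends with its value and carries no trailing timestamp.

```python
def read_counter(metrics_text, name):
    """Sum every sample of a counter in Prometheus text exposition format."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sample, _, value = line.rpartition(" ")
        # Match `name` alone or `name{labels}`, not a longer metric name.
        if sample == name or sample.startswith(name + "{"):
            total += float(value)
    return total


def cache_hit_ratio(metrics_text):
    """Return hits / (hits + misses), or None when there were no lookups."""
    hits = read_counter(metrics_text, "onionoo_cache_hits_total")
    misses = read_counter(metrics_text, "onionoo_cache_misses_total")
    lookups = hits + misses
    return hits / lookups if lookups else None


# Made-up sample in Prometheus text format.
SAMPLE = """\
# TYPE onionoo_cache_hits_total counter
onionoo_cache_hits_total 30.0
# TYPE onionoo_cache_misses_total counter
onionoo_cache_misses_total 10.0
"""
ratio = cache_hit_ratio(SAMPLE)
```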
### Raw passthrough
For large payloads (`/v1/details`), pass `?raw=true` to skip Pydantic re-validation and forward the upstream JSON verbatim. Trade-off: raw mode does **not** apply semantic key remapping (e.g. on `/v1/summary` you'll see `n,f,a,r` rather than `nickname,fingerprint,addresses,running`) and **no `_meta` block is injected**.
### Response metadata (`_meta`)
Non-raw responses on `/v1/*` include a proxy-injected `_meta` block at the top of the envelope:
```json
{
  "_meta": {
    "cache_age_seconds": 12.345,
    "upstream_last_modified": "Fri, 15 May 2026 12:00:00 GMT"
  },
  "version": "9.0",
  ...
}
```
`cache_age_seconds = 0.0` means the response was just fetched from Onionoo. A non-zero value means the proxy served it from its in-memory cache.
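A caller could interpret the block along these lines. The helper is hypothetical and the wording of the returned strings is arbitrary; treating a missing `_meta` block as a raw passthrough response is an assumption based on the raw-mode behaviour described earlier.

```python
def describe_freshness(payload):
    """Summarise where a `/v1/*` response came from, based on its `_meta` block."""
    meta = payload.get("_meta")
    if meta is None:
        return "no _meta block (raw passthrough)"
    age = meta.get("cache_age_seconds", 0.0)
    if age == 0.0:
        return "fetched from Onionoo just now"
    return f"served from cache, {age:.1f}s old"
```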
### Trimming payloads with `fields=`
All `/v1/*` endpoints accept `?fields=a,b,c`. On `/summary` and `/details` Onionoo applies the projection at the upstream level; on history endpoints (bandwidth, weights, clients, uptime) Onionoo applies it where supported. Using it on large queries can shrink LLM input by an order of magnitude.
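A small Python helper for building such URLs might look like this. It is a sketch with a hypothetical function name; the base URL and parameter values mirror the curl examples earlier in this README.

```python
from urllib.parse import urlencode


def build_v1_url(base_url, endpoint, fields=None, **params):
    """Build a `/v1/<endpoint>` URL with an optional comma-separated `fields` list."""
    query = dict(params)
    if fields:
        query["fields"] = ",".join(fields)
    # Keep commas readable instead of percent-encoding them as %2C.
    query_string = urlencode(query, safe=",")
    url = f"{base_url.rstrip('/')}/v1/{endpoint}"
    return f"{url}?{query_string}" if query_string else url


url = build_v1_url(
    "http://localhost:8000", "details",
    fields=["nickname", "fingerprint"], limit=1, search="moria",
)
```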
## Use as an MCP server
This project ships an [MCP](https://modelcontextprotocol.io) server with two
transports — pick whichever fits your client.
### Tools
**Task-oriented (recommended for agents)**
- `find_relay(query)` — free-form lookup; auto-detects fingerprint, AS, IP, or nickname
- `get_relay_health(fingerprint)` — composite snapshot (details + uptime + bandwidth)
- `top_relays_by_bandwidth(country?, flag?, limit)` — top-N by consensus weight
- `compare_relays(fingerprints)` — parallel side-by-side details
- `country_summary(country)` — running relay count, total bandwidth, flag distribution
**Low-level pass-through (raw Onionoo endpoints)**
- `onionoo_summary`, `onionoo_details`, `onionoo_bandwidth`, `onionoo_weights`,
`onionoo_clients`, `onionoo_uptime` — each takes a `params` dict matching the
[Onionoo query spec](https://metrics.torproject.org/onionoo.html).
**Aggregates**
- `aggregate_relays(group_by="country"|"as"|"flag", running=True, top=N)` — server-side group-by, sorted by relay count.
> Streamable HTTP `/mcp` exposes the six low-level endpoints (`get_summary` …
> `get_uptime`) plus the three aggregate endpoints (`aggregate_countries`,
> `aggregate_as`, `aggregate_flags`). The task-oriented tools and the unified
> `aggregate_relays` live in the stdio server. Both transports can run side by side.
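The auto-detection that `find_relay` performs is not specified here, so the following Python sketch only illustrates one plausible way to classify a free-form query. The function name, the rule order, and the exact patterns (40 hex characters for a fingerprint, `AS` followed by digits for an autonomous system) are assumptions.

```python
import ipaddress
import re


def classify_query(query):
    """Guess whether a free-form query is a fingerprint, AS number, IP, or nickname."""
    text = query.strip()
    if re.fullmatch(r"[0-9A-Fa-f]{40}", text):
        return "fingerprint"
    if re.fullmatch(r"(?i:AS)\d+", text):
        return "as"
    try:
        ipaddress.ip_address(text)  # accepts both IPv4 and IPv6 literals
        return "ip"
    except ValueError:
        return "nickname"  # fallback when nothing more specific matches
```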
### Streamable HTTP transport (recommended for hosted use)
Run the FastAPI app — `/mcp` is mounted automatically.
Inspect with MCP Inspector:
```bash
npx @modelcontextprotocol/inspector
# Transport: Streamable HTTP
# URL: http://localhost:8000/mcp
```
Claude Desktop / Cursor:
```json
{
  "mcpServers": {
    "onionoo": {
      "type": "http",
      "url": "https://onionoo.anoni.net/mcp"
    }
  }
}
```
### stdio transport (recommended for local agents)
`uv sync` installs an `onionoo-mcp` console script:
```bash
onionoo-mcp
```
Claude Desktop / Cursor:
```json
{
  "mcpServers": {
    "onionoo": {
      "command": "uvx",
      "args": ["--from", "/path/to/onionoo-fastapi", "onionoo-mcp"]
    }
  }
}
```
Or, if the repo is checked out and you have `uv`:
```json
{
  "mcpServers": {
    "onionoo": {
      "command": "uv",
      "args": ["--directory", "/path/to/onionoo-fastapi", "run", "onionoo-mcp"]
    }
  }
}
```