https://github.com/getsentry/statsdproxy

A proxy for transforming, pre-aggregating and routing statsd metrics

# statsdproxy

A proxy for transforming, pre-aggregating and routing statsd metrics, like
[Veneur](https://github.com/stripe/veneur), [Vector](https://vector.dev/) or
[Brubeck](https://github.com/github/brubeck).

Currently supports the following transformations:

* Deny- or allow-listing of specific tag keys or metric names
* Adding hardcoded tags to all metrics
* Basic cardinality limiting, tracking either the number of distinct tag values
  per key or the overall number of timeseries (i.e. distinct combinations of
  metric name and tags).

See `example.yaml` for details.
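
To make the timeseries variant of cardinality limiting concrete, here is a minimal Rust sketch. It is not statsdproxy's implementation: `CardinalityLimiter`, `series_key` and `admit` are hypothetical names, there is no time window, and tags given in a different order count as a different series.

```rust
use std::collections::HashSet;

/// Builds a key from the metric name and the `#tags` section,
/// ignoring value, type and sample rate. (Hypothetical helper.)
pub fn series_key(line: &str) -> String {
    let name = line.split(':').next().unwrap_or(line);
    let tags = line
        .split('|')
        .find(|part| part.starts_with('#'))
        .unwrap_or("");
    format!("{name}{tags}")
}

/// Hypothetical limiter that admits at most `limit` distinct timeseries.
pub struct CardinalityLimiter {
    limit: usize,
    seen: HashSet<String>,
}

impl CardinalityLimiter {
    pub fn new(limit: usize) -> Self {
        Self { limit, seen: HashSet::new() }
    }

    /// Returns `true` if the metric belongs to a known series or there is
    /// still room for a new one; `false` means it should not be forwarded.
    pub fn admit(&mut self, line: &str) -> bool {
        let key = series_key(line);
        if self.seen.contains(&key) {
            return true;
        }
        if self.seen.len() >= self.limit {
            return false;
        }
        self.seen.insert(key);
        true
    }
}
```

With a limit of 2, metrics for `#env:prod` and `#env:dev` pass, and a metric introducing a third tag combination is rejected.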

A major goal is minimal overhead and **no loss of information** due to
unnecessarily strict parsing. statsdproxy is oriented around the
[dogstatsd](https://docs.datadoghq.com/developers/dogstatsd/datagram_shell/?tab=metrics)
protocol but should degrade gracefully for other statsd dialects: metrics in
those dialects, and any otherwise unparseable bytes, are forwarded as-is.
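
The "forward as-is" behaviour can be illustrated with a small Rust sketch of a tag deny-list. This is an assumption about how such a transformation might look, not code from the project: `strip_denied_tags` is a hypothetical function, and it only treats `|#...` sections after the first field as tags.

```rust
/// Removes tags whose key is listed in `denied` from a dogstatsd-style line.
/// Input that cannot be interpreted is returned byte-for-byte unchanged.
pub fn strip_denied_tags(line: &[u8], denied: &[&str]) -> Vec<u8> {
    // Non-UTF-8 input is forwarded untouched rather than rejected.
    let Ok(text) = std::str::from_utf8(line) else {
        return line.to_vec();
    };

    let mut sections: Vec<String> = Vec::new();
    for (index, section) in text.split('|').enumerate() {
        // The first section is `name:value`; never treat it as tags.
        let tags = if index > 0 { section.strip_prefix('#') } else { None };
        match tags {
            Some(tags) => {
                let kept: Vec<&str> = tags
                    .split(',')
                    .filter(|tag| {
                        let key = tag.split(':').next().unwrap_or(tag);
                        !denied.contains(&key)
                    })
                    .collect();
                // Drop the whole section if no tags are left.
                if !kept.is_empty() {
                    sections.push(format!("#{}", kept.join(",")));
                }
            }
            None => sections.push(section.to_string()),
        }
    }
    sections.join("|").into_bytes()
}
```

Lines without a tag section, and payloads that are not valid UTF-8, come out exactly as they went in.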

**This is not a Sentry product.** It is a side project built during Hackweek
and is not deployed in any sort of production environment.

## Basic usage

1. Run a "statsd server" on port 8081 that just prints metrics

```
socat -u UDP-RECVFROM:8081,fork SYSTEM:"cat; echo"
```

2. Copy `example.yaml` to `config.yaml` and edit it
3. Run statsdproxy to read metrics from port 8080, transform them using the
middleware in `config.yaml` and forward the new metrics to port 8081:

```
cargo run --release -- --listen 127.0.0.1:8080 --upstream 127.0.0.1:8081 -c config.yaml
```

4. Send metrics to statsdproxy:

```
yes 'users.online:1|c|@0.5' | nc -u 127.0.0.1 8080
```

5. You should see new metrics in `socat` with your middlewares applied.
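
If you would rather send a test metric from Rust than from `nc`, a minimal sketch using only the standard library could look like this. `send_metric` is a hypothetical helper, not part of statsdproxy.

```rust
use std::io;
use std::net::{ToSocketAddrs, UdpSocket};

/// Sends one statsd datagram to `proxy_addr` and returns the bytes written.
pub fn send_metric<A: ToSocketAddrs>(proxy_addr: A, datagram: &[u8]) -> io::Result<usize> {
    // Bind to an ephemeral local port; statsd over UDP is fire-and-forget.
    let socket = UdpSocket::bind("127.0.0.1:0")?;
    socket.send_to(datagram, proxy_addr)
}
```

Calling `send_metric("127.0.0.1:8080", b"users.online:1|c|@0.5")` targets the listen address used in the steps above.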

## Usage with Snuba

Patch the following settings in `snuba/settings/__init__.py`:

```python
DOGSTATSD_HOST = "127.0.0.1"
DOGSTATSD_PORT = "8080"
```

With these settings, Snuba sends its metrics to port 8080, where statsdproxy
listens in the setup above.

## Processing model

This is the processing model used by the provided server. It should be respected
by any usage of this software as a library.

* The server receives metrics as bytes over UDP, either singly or several joined
  with `\n`.
* For every metric received, the server invokes the `poll` method of the topmost
  middleware.
  * The middleware may use this invocation to do any needed internal
    bookkeeping.
  * The middleware should then invoke the `poll` method of the next
    middleware, if any.
* Once `poll` returns, the server invokes the `submit` method of the topmost
  middleware with a mutable reference to the current metric.
  * The middleware should process the metric.
    * If processing was successful, and if appropriate to its function
      (e.g. a metric aggregator might hold onto metrics), the middleware
      should `submit` the processed metric to the next middleware, returning
      the result of this call.
    * If processing was unsuccessful (e.g. unknown StatsD dialect), the
      unchanged metric should be treated as the processed metric, and passed
      on or held as above.
  * If a middleware becomes unable to handle more metrics during
    processing, such that it cannot handle the current metric, it should
    return `Overloaded`.
  * If an overload is indicated, the server shall pause (TODO: how long)
    before calling `submit` again with the same metric. (If an overload is
    indicated too many times, maybe drop the metric?)
* Separately, if no metric is received by the server for 1 second, it will
  invoke the `poll` method of the topmost middleware. This invocation of `poll`
  should be handled the same as above.
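
The contract above can be sketched in Rust as follows. This is an illustration under stated assumptions, not the crate's actual API: the `Middleware` trait, `AddTag`, `Collector` and `handle_datagram` are hypothetical, the pause between retries is omitted because its duration is still open, dropping after `max_retries` is only one possible policy, and the 1-second idle `poll` is left out.

```rust
/// Error returned by `submit` when a middleware cannot take the metric right now.
#[derive(Debug, PartialEq, Eq)]
pub struct Overloaded;

/// Hypothetical trait mirroring the `poll`/`submit` contract.
pub trait Middleware {
    /// Called before every `submit`; a place for internal bookkeeping.
    fn poll(&mut self);
    /// Processes one metric and usually passes it on to the next middleware.
    fn submit(&mut self, metric: &mut Vec<u8>) -> Result<(), Overloaded>;
}

/// Terminal stage that stores up to `capacity` metrics, then reports overload.
pub struct Collector {
    pub received: Vec<Vec<u8>>,
    pub capacity: usize,
}

impl Middleware for Collector {
    fn poll(&mut self) {}

    fn submit(&mut self, metric: &mut Vec<u8>) -> Result<(), Overloaded> {
        if self.received.len() >= self.capacity {
            return Err(Overloaded);
        }
        self.received.push(metric.clone());
        Ok(())
    }
}

/// Appends one hardcoded tag, then forwards the result to `next`.
pub struct AddTag<M: Middleware> {
    pub tag: String,
    pub next: M,
}

impl<M: Middleware> Middleware for AddTag<M> {
    fn poll(&mut self) {
        // No bookkeeping here; propagate the call down the chain.
        self.next.poll();
    }

    fn submit(&mut self, metric: &mut Vec<u8>) -> Result<(), Overloaded> {
        // Work on a copy so a retry after `Overloaded` does not add the tag twice.
        let mut tagged = metric.clone();
        // Simplification: assumes an existing `#tags` section is the last field.
        if tagged.windows(2).any(|pair| pair == b"|#") {
            tagged.push(b',');
        } else {
            tagged.extend_from_slice(b"|#");
        }
        tagged.extend_from_slice(self.tag.as_bytes());
        // Return the downstream result so `Overloaded` reaches the server.
        self.next.submit(&mut tagged)
    }
}

/// Feeds one UDP payload through the chain; returns how many metrics were dropped.
pub fn handle_datagram<M: Middleware>(top: &mut M, datagram: &[u8], max_retries: usize) -> usize {
    let mut dropped = 0;
    let lines = datagram.split(|byte| *byte == b'\n').filter(|line| !line.is_empty());
    for line in lines {
        let mut metric = line.to_vec();
        top.poll();
        let mut attempts = 0;
        while top.submit(&mut metric).is_err() {
            attempts += 1;
            if attempts > max_retries {
                dropped += 1; // assumption: the text above only suggests dropping
                break;
            }
            // A real server would pause here before retrying.
        }
    }
    dropped
}
```

Because each middleware owns the next one, `poll` and `submit` naturally travel from the topmost middleware down the chain, and an `Overloaded` result from any stage propagates back up to the server loop.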