https://github.com/getsentry/statsdproxy
A proxy for transforming, pre-aggregating and routing statsd metrics
hackweek metrics statsd tag-non-production
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/getsentry/statsdproxy
- Owner: getsentry
- Created: 2023-08-21T13:04:36.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-01-24T10:27:18.000Z (5 months ago)
- Last Synced: 2025-03-22T06:47:11.245Z (3 months ago)
- Topics: hackweek, metrics, statsd, tag-non-production
- Language: Rust
- Homepage:
- Size: 111 KB
- Stars: 5
- Watchers: 32
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
# statsdproxy
A proxy for transforming, pre-aggregating and routing statsd metrics, like
[Veneur](https://github.com/stripe/veneur), [Vector](https://vector.dev/) or
[Brubeck](https://github.com/github/brubeck).

Currently supports the following transformations:

* Deny- or allow-listing of specific tag keys or metric names
* Adding hardcoded tags to all metrics
* Basic cardinality limiting, tracking the number of distinct tag values per
  key or the number of overall timeseries (= combinations of metrics and tags).

See `example.yaml` for details.

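To make the first transformation concrete, here is a minimal sketch of deny-listing tag keys on a single dogstatsd-style line (`name:value|type|@rate|#key:value,...`). It is an illustration only, not statsdproxy's implementation: the function name `deny_tag_keys` and its signature are hypothetical, and the parsing is deliberately loose so that input it does not recognize is returned unchanged.

```rust
/// Removes every tag whose key is in `denied_keys` from one metric line.
/// Hypothetical helper for illustration; not part of statsdproxy's API.
/// Input that does not look like `name:value|type...` is returned unchanged.
fn deny_tag_keys(metric: &str, denied_keys: &[&str]) -> String {
    // Without both ':' and '|' this is not a metric we understand: forward as-is.
    if !metric.contains(':') || !metric.contains('|') {
        return metric.to_string();
    }
    let mut sections: Vec<String> = Vec::new();
    for (index, section) in metric.split('|').enumerate() {
        // Section 0 is `name:value`; a later section starting with '#' holds tags.
        match section.strip_prefix('#') {
            Some(tags) if index > 0 => {
                let kept: Vec<&str> = tags
                    .split(',')
                    .filter(|tag| {
                        // The key is everything before the first ':' of a tag.
                        let key = tag.split(':').next().unwrap_or("");
                        !denied_keys.contains(&key)
                    })
                    .collect();
                // Drop the whole tag section if no tag survived.
                if !kept.is_empty() {
                    sections.push(format!("#{}", kept.join(",")));
                }
            }
            _ => sections.push(section.to_string()),
        }
    }
    sections.join("|")
}
```

For example, calling `deny_tag_keys("users.online:1|c|#env:prod,user_id:42", &["user_id"])` keeps only the `env` tag, while a string with no `:` or `|` passes through untouched.
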
A major goal is minimal overhead and **no loss of information** due to
unnecessarily strict parsing. Statsdproxy intends to orient itself around
[dogstatsd](https://docs.datadoghq.com/developers/dogstatsd/datagram_shell/?tab=metrics)
protocol but should gracefully degrade for other statsd dialects, in that those
metrics and otherwise unparseable bytes will be forwarded as-is.

**This is not a Sentry product**, not deployed in any sort of production
environment, but a side-project done during Hackweek.

## Basic usage

1. Run a "statsd server" on port 8081 that just prints metrics:

   ```
   socat -u UDP-RECVFROM:8081,fork SYSTEM:"cat; echo"
   ```

2. Copy `example.yaml` to `config.yaml` and edit it.
3. Run statsdproxy to read metrics from port 8080, transform them using the
   middleware in `config.yaml` and forward the new metrics to port 8081:

   ```
   cargo run --release -- --listen 127.0.0.1:8080 --upstream 127.0.0.1:8081 -c config.yaml
   ```

4. Send metrics to statsdproxy:

   ```
   yes 'users.online:1|c|@0.5' | nc -u 127.0.0.1 8080
   ```

5. You should see new metrics in `socat` with your middlewares applied.

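
If `nc` is not available, the sending step can be approximated from Rust using only the standard library. This is a sketch under the assumption that statsdproxy listens on `127.0.0.1:8080` as in the steps above; the helper name `send_metric` is made up for this example.

```rust
use std::io;
use std::net::UdpSocket;

/// Sends one statsd datagram to `target` (for example "127.0.0.1:8080").
/// Hypothetical helper for illustration; returns the number of bytes sent.
fn send_metric(target: &str, metric: &str) -> io::Result<usize> {
    // Bind to any free local port; UDP needs no connection handshake.
    let socket = UdpSocket::bind("127.0.0.1:0")?;
    socket.send_to(metric.as_bytes(), target)
}
```

Calling `send_metric("127.0.0.1:8080", "users.online:1|c|@0.5")` sends a single datagram, whereas the `yes | nc` pipeline above sends a continuous stream.
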
## Usage with Snuba
Patch the following settings in `snuba/settings/__init__.py`:
```python
DOGSTATSD_HOST = "127.0.0.1"
DOGSTATSD_PORT = "8080"
```

This will send metrics to port 8080.

## Processing model
This is the processing model used by the provided server. It should be respected
by any usage of this software as a library.

* The server receives metrics as bytes over UDP, either singly or several joined
  with `\n`.
* For every metric received, the server invokes the `poll` method of the topmost
  middleware.
  * The middleware may use this invocation to do any needed internal
    bookkeeping.
  * The middleware should then invoke the `poll` method of the next
    middleware, if any.
* Once `poll` returns, the server invokes the `submit` method of the topmost
  middleware with a mutable reference to the current metric.
  * The middleware should process the metric.
    * If processing was successful, and if appropriate to its function
      (e.g. a metric aggregator might hold onto metrics), the middleware
      should `submit` the processed metric to the next middleware, returning
      the result of this call.
    * If processing was unsuccessful (e.g. unknown StatsD dialect), the
      unchanged metric should be treated as the processed metric, and passed
      on or held as above.
  * If a middleware becomes unable to handle more metrics during
    processing, such that it cannot handle the current metric, it should
    return `Overloaded`.
  * If an overload is indicated, the server shall pause (TODO: how long)
    before calling `submit` again with the same metric. (If an overload is
    indicated too many times, maybe drop the metric?)
* Separately, if no metric is received by the server for 1 second, it will
  invoke the `poll` method of the topmost middleware. This invocation of `poll`
  should be handled the same as above.
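
The model above can be sketched in a few dozen lines of Rust. This is an illustration under stated assumptions, not the crate's actual API: the trait shape, the `Sink` and `AddTag` types, and `handle_datagram` are all hypothetical, metrics are modelled as plain `Vec<u8>`, and the retry limit and pause duration are parameters because the text leaves both open.

```rust
use std::thread;
use std::time::Duration;

/// Returned by `submit` when a middleware cannot take the metric right now.
#[derive(Debug, PartialEq)]
struct Overloaded;

/// Hypothetical middleware interface following the processing model.
trait Middleware {
    fn poll(&mut self);
    fn submit(&mut self, metric: &mut Vec<u8>) -> Result<(), Overloaded>;
}

/// Last stage: stores metrics and reports overload once `capacity` is reached.
struct Sink {
    received: Vec<Vec<u8>>,
    capacity: usize,
}

impl Middleware for Sink {
    fn poll(&mut self) {}

    fn submit(&mut self, metric: &mut Vec<u8>) -> Result<(), Overloaded> {
        if self.received.len() >= self.capacity {
            return Err(Overloaded);
        }
        self.received.push(metric.clone());
        Ok(())
    }
}

/// Appends a hardcoded tag, then forwards to the next middleware.
struct AddTag<M: Middleware> {
    tag: &'static str,
    polls: usize,
    next: M,
}

impl<M: Middleware> Middleware for AddTag<M> {
    fn poll(&mut self) {
        self.polls += 1; // internal bookkeeping
        self.next.poll(); // then poll the next middleware
    }

    fn submit(&mut self, metric: &mut Vec<u8>) -> Result<(), Overloaded> {
        let original_len = metric.len();
        // Input without '|' is not understood and is forwarded unchanged.
        if metric.contains(&b'|') {
            // Naive: assumes the metric has no tag section yet.
            metric.extend_from_slice(b"|#");
            metric.extend_from_slice(self.tag.as_bytes());
        }
        let result = self.next.submit(metric);
        if result.is_err() {
            // Undo the edit so a retry sees the same metric again.
            metric.truncate(original_len);
        }
        result
    }
}

/// Feeds one datagram (metrics joined by `\n`) through the chain.
/// Returns how many metrics were dropped after `max_retries` overloads.
fn handle_datagram<M: Middleware>(
    top: &mut M,
    datagram: &[u8],
    max_retries: usize,
    pause: Duration,
) -> usize {
    let mut dropped = 0;
    for line in datagram.split(|&b| b == b'\n').filter(|l| !l.is_empty()) {
        let mut metric = line.to_vec();
        top.poll();
        let mut attempts = 0;
        while top.submit(&mut metric).is_err() {
            attempts += 1;
            if attempts > max_retries {
                dropped += 1; // assumption: give up after too many overloads
                break;
            }
            thread::sleep(pause); // assumption: the pause length is unspecified
        }
    }
    dropped
}
```

A server loop built on this sketch would call `handle_datagram` for every UDP read and call `top.poll()` on its own whenever a read times out after 1 second, which covers the idle case in the last bullet.
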