https://github.com/rstudio-tech/gost
Gost is a Go implementation of the StatsD daemon.
- Host: GitHub
- URL: https://github.com/rstudio-tech/gost
- Owner: rstudio-tech
- License: mit
- Created: 2024-12-05T09:16:13.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-05T09:16:54.000Z (over 1 year ago)
- Last Synced: 2025-07-02T17:55:28.316Z (11 months ago)
- Language: Go
- Size: 220 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# gost
Gost is a Go implementation of the [StatsD](https://github.com/etsy/statsd/)
daemon.
## Usage
Install from source:

    go install github.com/cespare/gost@latest

Run `gost` with a conf file.

    gost -conf my/config.toml

By default it uses `conf.toml`. This repo includes a [`conf.toml`](conf.toml)
that should get you started. It has a lot of comments that explain what all the
options are.
### Messages
Gost is largely statsd compatible and any statsd library you want should work
with it out of the box. The main API difference is that gauges cannot be delta
values (they are always interpreted as absolute).
For completeness, here is a summary of the supported messages. All messages are
sent via UDP to localhost on a port configured by the `port` setting in the
config file. Typically each message is a UDP packet, but multiple messages can
be sent in a single packet by separating them with `\n` characters.
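
As a rough illustration of the packet format described above, here is a minimal Go sketch that joins several messages with `\n` and sends them as one UDP packet. The helper names `buildPacket` and `send` are hypothetical, and port 8125 is only an assumption; use whatever `port` is set to in your config file.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// buildPacket joins statsd messages into a single UDP payload,
// separating them with "\n" as described in the README.
func buildPacket(msgs ...string) []byte {
	return []byte(strings.Join(msgs, "\n"))
}

// send writes one UDP packet containing payload to addr.
func send(addr string, payload []byte) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write(payload)
	return err
}

func main() {
	// Assumption: gost listens on port 8125; check `port` in your config.
	payload := buildPacket("rails.requests:1|c", "active_users:992|g")
	if err := send("localhost:8125", payload); err != nil {
		fmt.Println("send failed:", err)
	}
}
```

Because UDP is connectionless, a successful `send` does not prove that gost received the packet.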
There are two data types involved: **keys** and **values**. **keys** are ASCII
strings (see the Key Format section below for details). **values** are
floats written in plain decimal notation (no exponent):

    /^[+\-]?\d+(\.\d+)?$/

Counters may include a sampling rate, which has the same format as a value. This
tells gost that only that fraction of events was reported, and gost divides the
counter value by the sampling rate to obtain an estimate of the true value. For
example, `page_hits:135|c|@0.1` is counted as 1350.
**Counters**
A counter records occurrences of some event, or other values that can be
accumulated by summing them.
For each counter, gost records two metrics:
* `count`: the raw counts (scaled for sample rate)
* `rate`: the rate, per second
Syntax: `<key>:<value>|c(|@<sampling rate>)?`

Examples:

    rails.requests:1|c
    page_hits:135|c|@0.1

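
To make the sample-rate scaling concrete, here is a minimal Go sketch of parsing a counter message and dividing by the sampling rate. It is not gost's actual parser: `parseCounter` is a hypothetical helper, and rejecting non-positive rates is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// valueRe is the value format quoted in the README.
var valueRe = regexp.MustCompile(`^[+\-]?\d+(\.\d+)?$`)

// parseCounter parses "key:value|c" with an optional "|@rate" suffix and
// returns the key and the value scaled by the sampling rate.
func parseCounter(msg string) (string, float64, error) {
	key, rest, ok := strings.Cut(msg, ":")
	if !ok || key == "" {
		return "", 0, errors.New("missing key")
	}
	fields := strings.Split(rest, "|")
	if len(fields) < 2 || len(fields) > 3 || fields[1] != "c" {
		return "", 0, errors.New("not a counter message")
	}
	if !valueRe.MatchString(fields[0]) {
		return "", 0, errors.New("malformed value")
	}
	value, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return "", 0, err
	}
	rate := 1.0 // no sampling rate means every event was reported
	if len(fields) == 3 {
		if !strings.HasPrefix(fields[2], "@") || !valueRe.MatchString(fields[2][1:]) {
			return "", 0, errors.New("malformed sampling rate")
		}
		rate, err = strconv.ParseFloat(fields[2][1:], 64)
		if err != nil || rate <= 0 {
			// Assumption: a zero or negative rate is treated as an error.
			return "", 0, errors.New("sampling rate must be positive")
		}
	}
	return key, value / rate, nil
}

func main() {
	for _, m := range []string{"rails.requests:1|c", "page_hits:135|c|@0.1"} {
		key, count, err := parseCounter(m)
		fmt.Println(key, count, err)
	}
}
```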
**Timers**
Timers are for measuring the elapsed time of some operation. These are more
complex than the other kinds of stats. For each timer key, gost records the
following metrics during each flush period:
* `timer.count`: the number of timer calls that have been recorded
* `timer.rate`: the rate at which timer calls came in, per second
* `timer.min`, `timer.max`: the min and max values of the timer during the flush
interval
* `timer.mean`, `timer.median`, `timer.stdev`: the mean, median, and standard
deviation, respectively, of the timer values during the flush interval
* `timer.sum`: the total sum of all timer values during the interval. This
value, in concert with `timer.count`, can be used (by some other system) to
compute mean values across flush buckets.
Syntax: `<key>:<value>|ms`

Example: `s3_backup:1411|ms`
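
The timer metrics listed above can be sketched in Go as follows. This is an illustration rather than gost's implementation: the `timerStats` type and both functions are hypothetical, and the choices of population standard deviation and of averaging the two middle values for an even-sized median are assumptions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// timerStats holds the per-flush metrics named in the README.
type timerStats struct {
	Count                                    int
	Rate, Min, Max, Mean, Median, Stdev, Sum float64
}

// summarize computes the metrics for the timer values received during one
// flush interval of flushSeconds seconds.
func summarize(values []float64, flushSeconds float64) timerStats {
	n := len(values)
	if n == 0 {
		return timerStats{}
	}
	sorted := append([]float64(nil), values...) // leave the input untouched
	sort.Float64s(sorted)
	var sum float64
	for _, v := range sorted {
		sum += v
	}
	mean := sum / float64(n)
	var sq float64
	for _, v := range sorted {
		sq += (v - mean) * (v - mean)
	}
	median := sorted[n/2]
	if n%2 == 0 {
		median = (sorted[n/2-1] + sorted[n/2]) / 2
	}
	return timerStats{
		Count: n, Rate: float64(n) / flushSeconds,
		Min: sorted[0], Max: sorted[n-1],
		Mean: mean, Median: median,
		Stdev: math.Sqrt(sq / float64(n)), Sum: sum,
	}
}

// meanAcross combines several flush buckets using only sum and count.
func meanAcross(buckets []timerStats) float64 {
	var sum float64
	var count int
	for _, b := range buckets {
		sum += b.Sum
		count += b.Count
	}
	if count == 0 {
		return 0
	}
	return sum / float64(count)
}

func main() {
	a := summarize([]float64{10, 20}, 10)
	b := summarize([]float64{30, 40, 50, 60}, 10)
	fmt.Println(a.Mean, b.Mean, meanAcross([]timerStats{a, b}))
}
```

`meanAcross` shows why `timer.sum` is useful: averaging the per-bucket means directly would weight a bucket with two samples the same as a bucket with four.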
**Gauges**
A gauge is simply a value that varies over time. The most recent value of the
gauge is the result gost emits during each flush.
Syntax: `<key>:<value>|g`

Example: `active_users:992|g`
**Sets**
A set records the unique occurrences of some value. The metric sent to graphite
is the number of unique values that were given under a particular key during a
flush interval.
Syntax: `<key>:<value>|s`

Example: `user_id:135|s`
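
A minimal Go sketch of the set semantics described above: count distinct values per key, then reset at each flush. The `setAggregator` type is hypothetical, and comparing values as strings (so `135` and `135.0` would count as different) is an assumption of this sketch, not documented gost behavior.

```go
package main

import "fmt"

// setAggregator tracks the distinct values seen per key in one flush interval.
type setAggregator struct {
	seen map[string]map[string]struct{}
}

func newSetAggregator() *setAggregator {
	return &setAggregator{seen: make(map[string]map[string]struct{})}
}

// Add records one value for key; repeats of the same value are ignored.
func (a *setAggregator) Add(key, value string) {
	if a.seen[key] == nil {
		a.seen[key] = make(map[string]struct{})
	}
	a.seen[key][value] = struct{}{}
}

// Flush returns the number of unique values per key and resets the state.
func (a *setAggregator) Flush() map[string]int {
	out := make(map[string]int, len(a.seen))
	for k, vals := range a.seen {
		out[k] = len(vals)
	}
	a.seen = make(map[string]map[string]struct{})
	return out
}

func main() {
	agg := newSetAggregator()
	for _, v := range []string{"135", "135", "136"} {
		agg.Add("user_id", v)
	}
	// Three messages arrived, but only two distinct values.
	fmt.Println(agg.Flush()["user_id"])
}
```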
### Meta-stats
Gost sends back some stats about itself to graphite as well. This includes:
* `gost.bad_messages_seen`: a counter for the number of malformed messages gost
has received
* `gost.packets_received`: a counter for the number of packets gost has read
* `gost.distinct_metrics_flushed`: a gauge for the number of stats sent to
graphite during this flush
* `gost.distinct_forwarded_metrics_flushed`: a gauge for the number of stats
forwarded to another gost during this flush (see Counter Forwarding, below)
There are some other counters for various error conditions. Most of these also
show up in the stderr logs.
### OS stats
One nice feature of gost is that, if you're running on a Linux system, it can
automatically send back statistics about the host, including memory, CPU,
network, and disk information. See [the example configuration file](conf.toml)
for how to set this up, and detailed information on what counters are sent.
### Script stats
Gost is able to consume messages via scripts that emit statsd-formatted messages
to stdout. See [the configuration file](conf.toml) for the options to specify
the script directory and the interval between runs.
Each run interval, gost tries to list the script directory. For each regular
file in that directory, gost tries to run it as an executable (all scripts are
started at the same time). The output is read line by line and each line is
parsed as a statsd message. If a line cannot be parsed, gost stops parsing the
output of that script. If a script runs so long that the next run interval
arrives, it is not started again until it has finished (so at most one copy of
each script is running at once).
The scripts are executed with no arguments from gost's current directory. Stdin
and stderr are null devices. Only stdout is used. Scripts must be executable.
Any errors running the script (including a non-zero exit status) trigger
debugging output and meta-stats.
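
Any executable works as a script, including a compiled Go binary. The following is a minimal sketch of one that prints a single gauge to stdout; the metric name `tmp.entries` and the helper `formatGauge` are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// formatGauge renders a gauge message in the key:value|g form.
// The 'f' format never produces exponent notation, which the value
// format shown in the Messages section does not allow.
func formatGauge(key string, value float64) string {
	return key + ":" + strconv.FormatFloat(value, 'f', -1, 64) + "|g"
}

func main() {
	// Report how many entries the system temp directory holds.
	entries, err := os.ReadDir(os.TempDir())
	if err != nil {
		// A non-zero exit status is reported by gost as a script error.
		os.Exit(1)
	}
	fmt.Println(formatGauge("tmp.entries", float64(len(entries))))
}
```

Build it and place the resulting binary in the configured script directory. Since stderr is a null device, diagnostics written there are discarded.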
### Debug interface
The `debug_port` setting controls the port of a local server that gost starts up
for debugging. Gost will print its (UDP) input and (Graphite) output via TCP to
any client that connects to this port. So if you're using `debug_port = 8126` as
in the example config, then you can connect like this:

    $ telnet localhost 8126

and you will see gost's input and output. This is very handy for debugging. You
may want to select just a subset of the data; for instance:

    $ nc localhost 8126 | grep '\[out\]' # just outbound messages

## Key Format
Gost message keys are formed from printable ASCII characters with a few
restrictions, listed below. The maximum size of an accepted UDP packet (which
usually contains one message but may contain several separated by `\n`) is 10 KB;
this sets the only limit on key length.
| source char | converted to | reason |
| ----------- | -------------- | ------------------------------------------------------- |
| newline | error | newlines end gost messages |
| `:` | error | colons end gost keys |
| space | `_` | graphite uses space in its message format |
| `/` | `-` | graphite can't handle `/` (keys are filenames) |
| `<`, `>` | removed | graphite doesn't handle `<` (`>` excluded for symmetry) |
| `*` | removed | graphite uses `*` as a wildcard |
| `[`, `]` | removed | graphite uses `[...]` for char set matching |
| `{`, `}` | removed | graphite uses `{...}` for matching multiple items |
Additionally, note that a trailing `.` on a key will be ignored by Graphite, so
`foo.` is the same as `foo`.
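
The conversion table can be expressed as a short Go function. This is a sketch of the rules as listed, not gost's own code: `sanitizeKey` is a hypothetical name, and it only handles the characters in the table (the treatment of non-printable characters is not covered here).

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeKey applies the conversions from the Key Format table.
func sanitizeKey(key string) (string, error) {
	var b strings.Builder
	for _, r := range key {
		switch r {
		case '\n', ':':
			// Newlines end messages and colons end keys, so neither
			// may appear inside a key.
			return "", fmt.Errorf("invalid character %q in key", r)
		case ' ':
			b.WriteByte('_')
		case '/':
			b.WriteByte('-')
		case '<', '>', '*', '[', ']', '{', '}':
			// Dropped: these have special meaning to Graphite.
		default:
			b.WriteRune(r)
		}
	}
	return b.String(), nil
}

func main() {
	key, err := sanitizeKey("web server/req<1>*[a]{b}")
	fmt.Println(key, err)
}
```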
## Counter Forwarding
Instead of sending to graphite, gost can forward metrics (counters only) to
another gost, which in turn sends to graphite.
Enable forwarding by setting the `forwarding_addr` option to the network address
of the gost to which to forward. Then to forward a counter, prefix it with `f|`:

    f|web.requests:1|c

This counter will not be flushed to graphite, but will be sent to the forwarder
gost.
To enable gost to act as a forwarder (that is, it will accept forwarded messages
in addition to normal UDP messages), set the `forwarder_listen_addr` to the bind
address to use to listen for forwarded messages. You can also use the
`forwarded_namespace` setting to control the namespace applied to forwarded
stats.
**Motivation:** It's inconvenient to always have to sum your graphite queries
across all your servers -- often you only care about the global count. But
graphite doesn't add together counters for you when it ingests them. To get
around this, some folks run a network topology where they forward all their
metrics into a single statsd across the network. This has some big downsides:
* It's lossy (UDP)
* The QPS is really limited in a setup like this
* It's a lot of network traffic
With counter forwarding, you can get a lot of the advantages without the
disadvantages:
* Gost-to-gost forwarding is over TCP
* Gosts only flush once every N milliseconds, so the raw stats don't cross the
network
* The forwarding protocol is an efficient binary format
* You can handle a large volume of metrics this way
Of course, this is still a single point of failure in your metrics collection
system, but if you're using Graphite you've probably got that anyway.
I suggest you host your forwarder gost instance alongside Graphite.
It is possible for a gost instance to forward counters to itself.
## Tuning
If you're trying to push a lot of stats into gost, it may start dropping
messages. This may be because your OS is using a very small buffer for the UDP
socket. You can typically tune this; on my Linux system, for instance, I can
raise the limits using `sysctl`:
```
# 25MiB read and write buffer sizes
net.core.rmem_max=26214400
net.core.rmem_default=26214400
net.core.wmem_max=26214400
net.core.wmem_default=26214400
```
With this kind of tuning, your gost instance should be able to easily handle
hundreds of thousands of messages per second on moderate hardware.
(This will, of course, incur a lot of system load, and typically you'll want to
use sampling to limit the gost QPS to something reasonable.)
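
Client-side sampling might look like the following minimal Go sketch: each event is sent only with probability `rate`, and the message carries `|@rate` so that gost can scale the count back up. The helper `sampledCounter` is hypothetical; most statsd client libraries offer an equivalent option.

```go
package main

import (
	"fmt"
	"math/rand"
	"strconv"
)

// sampledCounter returns the message to send for one event, or false if
// the event should be skipped. draw must be a uniform random number in
// [0, 1), for example from rand.Float64().
func sampledCounter(key string, rate, draw float64) (string, bool) {
	if draw >= rate {
		return "", false
	}
	return key + ":1|c|@" + strconv.FormatFloat(rate, 'f', -1, 64), true
}

func main() {
	sent := 0
	for i := 0; i < 1000; i++ {
		if _, ok := sampledCounter("page_hits", 0.1, rand.Float64()); ok {
			sent++
		}
	}
	// Roughly a tenth of the events are sent; the exact number varies.
	fmt.Printf("sent %d of 1000 events\n", sent)
}
```

Passing the random draw in as an argument keeps the function deterministic and easy to test.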
## Differences with StatsD
* Statsd only allows keys matching `/^[a-zA-Z0-9\-_\.]+$/`; gost is more
permissive (see Key Format, above).
* Gauges cannot be deltas; they must be absolute values.
* Timers don't return as much information as in statsd, and they're not
customizable.
* gost can record OS stats from the host and deliver them to graphite as well.
* The "meta-stats" gost sends back are different from StatsD (there are a lot
fewer of them).
* Gost is very fast. It can handle several times the load statsd can before
dropping messages. In my unscientific tests on my Linux dev machine, I got
statsd up to about 80k qps before it started dropping messages, while gost got
to 350k+ qps without dropping any messages.