{"id":21824377,"url":"https://github.com/shopify/statsd-instrument","last_synced_at":"2025-05-13T20:12:08.367Z","repository":{"id":40944308,"uuid":"2113660","full_name":"Shopify/statsd-instrument","owner":"Shopify","description":"A StatsD client for Ruby apps. Provides metaprogramming methods to inject StatsD instrumentation into your code.","archived":false,"fork":false,"pushed_at":"2025-03-25T19:29:35.000Z","size":1205,"stargazers_count":582,"open_issues_count":13,"forks_count":98,"subscribers_count":492,"default_branch":"main","last_synced_at":"2025-05-07T23:35:33.441Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://shopify.github.io/statsd-instrument","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Shopify.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2011-07-27T15:46:23.000Z","updated_at":"2025-05-05T00:27:46.000Z","dependencies_parsed_at":"2023-02-10T04:15:50.343Z","dependency_job_id":"93712674-ba2f-4c81-ad79-becfc4502714","html_url":"https://github.com/Shopify/statsd-instrument","commit_stats":{"total_commits":647,"total_committers":95,"mean_commits":6.810526315789474,"dds":0.5162287480680061,"last_synced_commit":"4045921348b79b75438369b9fe5a1eb553c3135f"},"previous_names":[],"tags_count":101,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shopify%2Fstatsd-instrument","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shopify%2Fstatsd-instrum
ent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shopify%2Fstatsd-instrument/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shopify%2Fstatsd-instrument/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Shopify","download_url":"https://codeload.github.com/Shopify/statsd-instrument/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254020615,"owners_count":22000755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-27T17:57:44.288Z","updated_at":"2025-05-13T20:12:08.330Z","avatar_url":"https://github.com/Shopify.png","language":"Ruby","readme":"# StatsD client for Ruby apps\n\nThis is a ruby client for statsd (https://github.com/statsd/statsd). It provides\na lightweight way to track and measure metrics in your application.\n\nWe call out to statsd by sending data over a UDP socket. UDP sockets are fast,\nbut unreliable, there is no guarantee that your data will ever arrive at its\nlocation. In other words, fire and forget. 
This is perfect for this use case\nbecause it means your code doesn't get bogged down trying to log statistics.\nWe send data to statsd several times per request and haven't noticed a\nperformance hit.\n\nFor more information about StatsD, see the [README of the StatsD\nproject](https://github.com/statsd/statsd).\n\n## Configuration\n\nIt's recommended to configure this library by setting environment variables.\nThe following environment variables are supported:\n\n- `STATSD_ADDR`: (default `localhost:8125`) The address to send the StatsD UDP\n  datagrams to.\n- `STATSD_IMPLEMENTATION`: (default: `datadog`). The StatsD implementation you\n  are using. `statsd` and `datadog` are supported. Some features\n  are only available on certain implementations.\n- `STATSD_ENV`: The environment StatsD will run in. If this is not set\n  explicitly, this will be determined based on other environment variables,\n  like `RAILS_ENV` or `ENV`. The library behaves differently depending on the environment:\n\n  - In the **production** and **staging** environments, the library will\n    actually send UDP packets.\n  - In the **test** environment, it will swallow all calls, but allow you to\n    capture them for testing purposes. See below for notes on writing tests.\n  - In **development** and all other environments, it will write all calls to\n    the log (`StatsD.logger`, which by default writes to STDOUT).\n\n- `STATSD_SAMPLE_RATE`: (default: `1.0`) The default sample rate to use for all\n  metrics. This can be used to reduce the amount of network traffic and CPU\n  overhead the usage of this library generates. This can be overridden in a\n  metric method call.\n- `STATSD_PREFIX`: The prefix to apply to all metric names. 
This can be\n  overridden in a metric method call.\n- `STATSD_DEFAULT_TAGS`: A comma-separated list of tags to apply to all metrics.\n  (Note: tags are not supported by all implementations.)\n- `STATSD_BUFFER_CAPACITY`: (default: `5000`) The maximum number of events that\n  may be buffered before emitting threads will start to block. Increasing this\n  value may help applications that generate spikes of events. However, if the\n  application emits events faster than they can be sent, increasing it won't help.\n  If set to `0`, batching will be disabled, and events will be sent in individual\n  UDP packets, which is much slower.\n- `STATSD_FLUSH_INTERVAL`: (default: `1`) Deprecated. Setting this to `0` is\n  equivalent to setting `STATSD_BUFFER_CAPACITY` to `0`.\n- `STATSD_MAX_PACKET_SIZE`: (default: `1472`) The maximum size of UDP packets.\n  If your network is properly configured to handle larger packets, you may try\n  to increase this value for better performance, but most networks can't handle\n  larger packets.\n- `STATSD_BATCH_STATISTICS_INTERVAL`: (default: `0`) If non-zero, the `BatchedUDPSink`\n  will track and emit statistics on this interval to the default sink for your environment.\n  The currently tracked statistics are:\n\n  - `statsd_instrument.batched_udp_sink.batched_sends`: The number of batches sent, of any size.\n  - `statsd_instrument.batched_udp_sink.synchronous_sends`: The number of times the batched udp sender needed to send a statsd line synchronously, due to the buffer being full.\n  - `statsd_instrument.batched_udp_sink.avg_buffer_length`: The average buffer length, measured at the beginning of each batch.\n  - `statsd_instrument.batched_udp_sink.avg_batched_packet_size`: The average per-batch byte size of the packet sent to the underlying UDPSink.\n  - `statsd_instrument.batched_udp_sink.avg_batch_length`: The average number of statsd lines per batch.\n\n\n### Experimental aggregation feature\n\nThe aggregation feature is currently 
experimental and aims to improve the efficiency of metrics reporting by aggregating\nmultiple metric events into a single sample. This reduces the number of network requests and can significantly decrease the overhead \nassociated with high-frequency metric reporting.\n\nThis means that instead of sending each metric event individually, the library will aggregate multiple events into a single sample and send it to the StatsD server.\nExample: \n\nInstead of sending counters in multiple packets like this:\n```\nmy.counter:1|c\nmy.counter:1|c\nmy.counter:1|c\n```\n\nThe library will aggregate them into a single packet like this:\n```\nmy.counter:3|c\n```\n\nand for histograms/distributions:\n```\nmy.histogram:1|h\nmy.histogram:2|h\nmy.histogram:3|h\n```\n\nThe library will aggregate them into a single packet like this:\n```\nmy.histogram:1:2:3|h\n```\n\n#### Enabling Aggregation\n\nTo enable metric aggregation, set the following environment variables:\n\n- `STATSD_ENABLE_AGGREGATION`: Set this to `true` to enable the experimental aggregation feature. Aggregation is disabled by default.\n- `STATSD_AGGREGATION_INTERVAL`: Specifies the interval (in seconds) at which aggregated metrics are flushed and sent to the StatsD server. \nFor example, setting this to `2` will aggregate and send metrics every 2 seconds. Two seconds is also the default value if this environment variable is not set.\n\nPlease note that since aggregation is an experimental feature, it should be used with caution in production environments.\n\n\u003e [!WARNING]\n\u003e This feature is only compatible with Datadog Agent's version \u003e=6.25.0 \u0026\u0026 \u003c7.0.0 or Agent's versions \u003e=7.25.0.\n\n## StatsD keys\n\nStatsD keys look like 'admin.logins.api.success'. 
Dots are used as namespace separators.\n\n## Usage\n\nYou can either use the basic methods to submit stats over StatsD, or you can use the metaprogramming methods to instrument your methods with some basic stats (call counts, successes \u0026 failures, and timings).\n\n#### StatsD.measure\n\nLets you benchmark how long the execution of a specific method takes.\n\n``` ruby\n# You can pass a key and a ms value\nStatsD.measure('GoogleBase.insert', 2.55)\n\n# or more commonly pass a block that calls your code\nStatsD.measure('GoogleBase.insert') do\n  GoogleBase.insert(product)\nend\n```\n\n#### StatsD.increment\n\nLets you increment a key in statsd to keep a count of something. If the specified key doesn't exist it will create it for you.\n\n``` ruby\n# increments default to +1\nStatsD.increment('GoogleBase.insert')\n# you can also specify how much to increment the key by\nStatsD.increment('GoogleBase.insert', 10)\n# you can also specify a sample rate, so only 1/10 of events\n# actually get to statsd. Useful for very high volume data\nStatsD.increment('GoogleBase.insert', sample_rate: 0.1)\n```\n\n#### StatsD.gauge\n\nA gauge is a single numerical value that tells you the state of the system at a point in time. A good example would be the number of messages in a queue.\n\n``` ruby\nStatsD.gauge('GoogleBase.queued', 12, sample_rate: 1.0)\n```\n\nNormally, you shouldn't update this value too often, and therefore there is no need to sample this kind of metric.\n\n#### StatsD.set\n\nA set keeps track of the number of unique values that have been seen. This is a good fit for keeping track of the number of unique visitors. The value can be a string.\n\n``` ruby\n# Submit the customer ID to the set. 
It will only be counted if it hasn't been seen before.\nStatsD.set('GoogleBase.customers', \"12345\", sample_rate: 1.0)\n```\n\nBecause you are counting unique values, the results of using a sampling value less than 1.0 can lead to unexpected, hard to interpret results.\n\n#### StatsD.histogram\n\nBuilds a histogram of numeric values.\n\n``` ruby\nStatsD.histogram('Order.value', order.value_in_usd.to_f, tags: { source: 'POS' })\n```\n\n*Note: This is only supported by the beta datadog implementation.*\n\n#### StatsD.distribution\n\nA modified gauge that submits a distribution of values over a sample period. Arithmetic and statistical calculations (percentiles, average, etc.) on the data set are performed server side rather than client side like a histogram.\n\n```ruby\nStatsD.distribution('shipit.redis_connection', 3)\n```\n\n*Note: This is only supported by the beta datadog implementation.*\n\n#### StatsD.event\n\nAn event is a (title, text) tuple that can be used to correlate metrics with something that occurred within the system.\nFor instance, this is a good fit for correlating response time variation with a deploy of new code.\n\n```ruby\nStatsD.event('shipit.deploy', 'started')\n```\n\n*Note: This is only supported by the [datadog implementation](https://docs.datadoghq.com/guides/dogstatsd/#events).*\n\nEvents support additional metadata such as `date_happened`, `hostname`, `aggregation_key`, `priority`, `source_type_name`, `alert_type`.\n\n#### StatsD.service_check\n\nA service check is a (check_name, status) tuple that can be used to monitor the status of services your application depends on.\n\n```ruby\nStatsD.service_check('shipit.redis_connection', 'ok')\n```\n\n*Note: This is only supported by the [datadog implementation](https://docs.datadoghq.com/guides/dogstatsd/#service-checks).*\n\nService checks support 
additional metadata such as `timestamp`, `hostname`, `message`.\n\n\n### Metaprogramming Methods\n\nAs mentioned, it's most common to use the provided metaprogramming methods. This lets you define all of your instrumentation in one file and not litter your code with instrumentation details. You should enable a class for instrumentation by extending it with the `StatsD::Instrument` module.\n\n``` ruby\nGoogleBase.extend StatsD::Instrument\n```\n\nThen use the methods provided below to instrument methods in your class.\n\n#### statsd\\_measure\n\nThis will measure how long a method takes to run, and submit the result to the given key.\n\n``` ruby\nGoogleBase.statsd_measure :insert, 'GoogleBase.insert'\n```\n\n#### statsd\\_count\n\nThis will increment the given key even if the method doesn't finish (i.e. raises).\n\n``` ruby\nGoogleBase.statsd_count :insert, 'GoogleBase.insert'\n```\n\nNote how I used the 'GoogleBase.insert' key above when measuring this method, and I reused it here when counting the method calls. StatsD automatically separates these two kinds of stats into namespaces so there won't be a key collision here.\n\n#### statsd\\_count\\_if\n\nThis will only increment the given key if the method executes successfully.\n\n``` ruby\nGoogleBase.statsd_count_if :insert, 'GoogleBase.insert'\n```\n\nSo now, if GoogleBase#insert raises an exception or returns false (i.e. result == false), we won't increment the key. If you want to define what success means for a given method you can pass a block that takes the result of the method.\n\n``` ruby\nGoogleBase.statsd_count_if :insert, 'GoogleBase.insert' do |response|\n  response.code == 200\nend\n```\n\nIn the above example we will only increment the key in statsd if the result of the block returns true. 
So the method is returning a Net::HTTP response and we're checking the status code.\n\n#### statsd\\_count\\_success\n\nSimilar to statsd_count_if, except this will increment one key in the case of success and another key in the case of failure.\n\n``` ruby\nGoogleBase.statsd_count_success :insert, 'GoogleBase.insert'\n```\n\nSo if this method fails execution (raises or returns false) we'll increment the failure key ('GoogleBase.insert.failure'), otherwise we'll increment the success key ('GoogleBase.insert.success'). Notice that we're modifying the given key before sending it to statsd.\n\nAgain you can pass a block to define what success means.\n\n``` ruby\nGoogleBase.statsd_count_success :insert, 'GoogleBase.insert' do |response|\n  response.code == 200\nend\n```\n\n### Instrumenting Class Methods\n\nYou can instrument class methods, just like instance methods, using the metaprogramming methods. You simply have to configure the instrumentation on the singleton class of the class you want to instrument.\n\n```ruby\nAWS::S3::Base.singleton_class.statsd_measure :request, 'S3.request'\n```\n\n### Dynamic Metric Names\n\nYou can use a lambda function instead of a string to dynamically set\nthe name of the metric. The lambda function must accept two arguments:\nthe object the function is being called on and the array of arguments\npassed.\n\n```ruby\nGoogleBase.statsd_count :insert, lambda{|object, args| object.class.to_s.downcase + \".\" + args.first.to_s + \".insert\" }\n```\n\n### Tags\n\nThe Datadog implementation supports tags, which you can use to slice and dice metrics in their UI. You can specify a list of tags as an option, either standalone tag (e.g. 
`\"mytag\"`), or key value based, separated by a colon: `\"env:production\"`.\n\n``` ruby\nStatsD.increment('my.counter', tags: ['env:production', 'unicorn'])\nGoogleBase.statsd_count :insert, 'GoogleBase.insert', tags: ['env:production']\n```\n\nIf the implementation is not set to `:datadog`, tags will not be included in the UDP packets, and a\nwarning is logged to `StatsD.logger`.\n\nYou can use a lambda function instead of a list of tags to set the metric tags.\nLike the dynamic metric name, the lambda function must accept two arguments:\nthe object the function is being called on and the array of arguments\npassed.\n\n``` ruby\nmetric_tagger = lambda { |object, args| { \"key\": args.first } }\nGoogleBase.statsd_count(:insert, 'GoogleBase.insert', tags: metric_tagger)\n```\n\n\u003e You can only use the dynamic tag while using the instrumentation through metaprogramming methods\n\n## Testing\n\nThis library comes with two modules, `StatsD::Instrument::Assertions` and `StatsD::Instrument::Matchers`, to help you write tests\nto verify StatsD is called properly.\n\n### minitest\n\n``` ruby\nclass MyTestcase \u003c Minitest::Test\n  include StatsD::Instrument::Assertions\n\n  def test_some_metrics\n    # This will pass if there is exactly one matching StatsD call\n    # it will ignore any other, non matching calls.\n    assert_statsd_increment('counter.name', sample_rate: 1.0) do\n      StatsD.increment('unrelated') # doesn't match\n      StatsD.increment('counter.name', sample_rate: 1.0) # matches\n      StatsD.increment('counter.name', sample_rate: 0.1) # doesn't match\n    end\n\n    # Set `times` if there will be multiple matches:\n    assert_statsd_increment('counter.name', times: 2) do\n      StatsD.increment('unrelated') # doesn't match\n      StatsD.increment('counter.name', sample_rate: 1.0) # matches\n      StatsD.increment('counter.name', sample_rate: 0.1) # matches too\n    end\n  end\n\n  def test_no_udp_traffic\n    # Verifies no StatsD calls occurred 
at all.\n    assert_no_statsd_calls do\n      do_some_work\n    end\n\n    # Verifies no StatsD calls occurred for the given metric.\n    assert_no_statsd_calls('metric_name') do\n      do_some_work\n    end\n  end\n\n  def test_more_complicated_stuff\n    # capture_statsd_calls will capture all the StatsD calls in the\n    # given block, and returns them as an array. You can then run your\n    # own assertions on it.\n    metrics = capture_statsd_calls do\n      StatsD.increment('mycounter', sample_rate: 0.01)\n    end\n\n    assert_equal 1, metrics.length\n    assert_equal 'mycounter', metrics[0].name\n    assert_equal :c, metrics[0].type\n    assert_equal 1, metrics[0].value\n    assert_equal 0.01, metrics[0].sample_rate\n  end\nend\n\n```\n\n### RSpec\n\n```ruby\nRSpec.configure do |config|\n  config.include StatsD::Instrument::Matchers\nend\n\nRSpec.describe 'Matchers' do\n  context 'trigger_statsd_increment' do\n    it 'will pass if there is exactly one matching StatsD call' do\n      expect { StatsD.increment('counter') }.to trigger_statsd_increment('counter')\n    end\n\n    it 'will pass if it matches the correct number of times' do\n      expect {\n        2.times do\n          StatsD.increment('counter')\n        end\n      }.to trigger_statsd_increment('counter', times: 2)\n    end\n\n    it 'will pass if it matches argument' do\n      expect {\n        StatsD.measure('counter', 0.3001)\n      }.to trigger_statsd_measure('counter', value: be_between(0.29, 0.31))\n    end\n\n    it 'will pass if there is no matching StatsD call on negative expectation' do\n      expect { StatsD.increment('other_counter') }.not_to trigger_statsd_increment('counter')\n    end\n\n    it 'will pass if every statsD call matches its call tag variations' do\n      expect do\n        StatsD.increment('counter', tags: ['variation:a'])\n        StatsD.increment('counter', tags: ['variation:b'])\n      end.to trigger_statsd_increment('counter', times: 1, tags: 
[\"variation:a\"]).and trigger_statsd_increment('counter', times: 1, tags: [\"variation:b\"])\n    end\n  end\nend\n```\n\n## Notes\n\n### Compatibility\n\nThe library is tested against Ruby 2.3 and higher. We are not testing on\ndifferent Ruby implementations besides MRI, but we expect it to work on other\nimplementations as well.\n\n### Reliance on DNS\n\nOut of the box, StatsD is set up to be unidirectional fire-and-forget over UDP. Configuring\nthe StatsD host to be a hostname rather than an IP address will trigger a DNS lookup (i.e. a synchronous TCP round trip).\nThis can be particularly problematic in clouds that have a shared DNS infrastructure such as AWS.\n\n1. Using a hardcoded IP avoids the DNS lookup but generally requires an application deploy to change.\n2. Hardcoding the DNS/IP pair in /etc/hosts allows the IP to change without redeploying your application but fails to scale as the number of servers increases.\n3. Installing caching software such as nscd that uses the DNS TTL avoids most DNS lookups but makes the exact moment of change indeterminate.\n\n\n## Links\n\nThis library was developed for shopify.com and is MIT licensed.\n\n- [API documentation](http://www.rubydoc.info/gems/statsd-instrument)\n- [The changelog](./CHANGELOG.md) covers the changes between releases.\n- [Contributing notes](./CONTRIBUTING.md) if you are interested in contributing to this library.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshopify%2Fstatsd-instrument","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshopify%2Fstatsd-instrument","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshopify%2Fstatsd-instrument/lists"}