{"id":18767773,"url":"https://github.com/lightstep/go-expohisto","last_synced_at":"2025-04-13T06:32:31.059Z","repository":{"id":61627278,"uuid":"546290638","full_name":"lightstep/go-expohisto","owner":"lightstep","description":"Golang implementation of the OpenTelemetry auto-scaling base-2-exponential histogram","archived":false,"fork":false,"pushed_at":"2024-03-07T23:53:18.000Z","size":56,"stargazers_count":9,"open_issues_count":2,"forks_count":2,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-03-26T23:21:56.006Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lightstep.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-10-05T21:14:08.000Z","updated_at":"2024-09-08T05:25:34.000Z","dependencies_parsed_at":"2022-10-18T17:45:29.555Z","dependency_job_id":null,"html_url":"https://github.com/lightstep/go-expohisto","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lightstep%2Fgo-expohisto","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lightstep%2Fgo-expohisto/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lightstep%2Fgo-expohisto/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lightstep%2Fgo-expohisto/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lightstep","download_url":"https://codeload.github.com/lightstep/go-expohisto/tar.gz/refs/heads/main","host":{"name":"GitHub"
,"url":"https://github.com","kind":"github","repositories_count":248674678,"owners_count":21143760,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T19:08:30.730Z","updated_at":"2025-04-13T06:32:27.270Z","avatar_url":"https://github.com/lightstep.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Base-2 Exponential Histogram\n\n## Design\n\nThis is a fixed-size data structure for aggregating the OpenTelemetry\nbase-2 exponential histogram introduced in [OTEP\n149](https://github.com/open-telemetry/oteps/blob/main/text/0149-exponential-histogram.md)\nand [described in the metrics data\nmodel](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md#exponentialhistogram).\nThe exponential histogram data point is characterized by a `scale`\nfactor that determines resolution.  Positive scales correspond with\nmore resolution, and negative scales correspond with less resolution.\n\nGiven a maximum size, in terms of the number of buckets, the\nimplementation determines the best scale possible given the set of\nmeasurements received.  The size of the histogram is configured using\nthe `WithMaxSize()` option, which defaults to 160.\n\nThe implementation here maintains the best resolution possible.  
Since\nthe scale parameter is shared by the positive and negative ranges, the\nbest value of the scale parameter is determined by the range with the\ngreater difference between minimum and maximum bucket index:\n\n```golang\nimport \"math\"\n\n// bucketsNeeded returns how many buckets span [minValue, maxValue] at the given scale.\nfunc bucketsNeeded(minValue, maxValue float64, scale int32) int32 {\n\treturn bucketIndex(maxValue, scale) - bucketIndex(minValue, scale) + 1\n}\n\n// bucketIndex is a simplified sketch of the logarithm mapping; it rounds down\n// and ignores the exact handling of bucket boundaries.\nfunc bucketIndex(value float64, scale int32) int32 {\n\treturn int32(math.Floor(math.Log(value) * math.Ldexp(math.Log2E, int(scale))))\n}\n```\n\nThe best scale is uniquely determined when `maxSize/2 \u003c\nbucketsNeeded(minValue, maxValue, scale) \u003c= maxSize`.  This\nimplementation maintains the best scale by rescaling as needed to stay\nwithin the maximum size.\n\n## Layout\n\n### Mapping function\n\nThe `mapping` sub-package contains the equations specified in the [data\nmodel for Exponential Histogram data\npoints](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md#exponentialhistogram).\n\nThere are two mapping functions used, depending on the sign of the\nscale.  Negative and zero scales use the `mapping/exponent` mapping\nfunction, which computes the bucket index directly from the bits of\nthe `float64` exponent.  This mapping function is used with scale `-10\n\u003c= scale \u003c= 0`.  Scales smaller than -10 map the entire normal\n`float64` number range into a single bucket, and thus are not considered\nuseful.\n\nThe `mapping/logarithm` mapping function uses `math.Log(value)` times\nthe scaling factor `math.Ldexp(math.Log2E, scale)`.  This mapping\nfunction is used with `0 \u003c scale \u003c= 20`.  
The maximum scale is\nselected because at scale 21 it becomes difficult to test\ncorrectness: at this point `math.MaxFloat64` maps to index\n`math.MaxInt32` and the `math/big` logic used in testing breaks down.\n\n### Data structure\n\nThe `structure` sub-package contains a Histogram aggregator for use by\nthe OpenTelemetry-Go Metrics SDK as well as OpenTelemetry Collector\nreceivers, processors, and exporters.\n\n## Implementation\n\nThe implementation maintains a slice of buckets and grows the array in\nsize only as necessary given the actual range of values, up to the\nmaximum size.  The structure of a single range of buckets is:\n\n```golang\ntype buckets struct {\n\tbacking    bucketsVarwidth[T]  // for T = uint8 | uint16 | uint32 | uint64\n\tindexBase  int32 // bucket number stored at index 0 of the backing array\n\tindexStart int32 // current minimum bucket number\n\tindexEnd   int32 // current maximum bucket number\n}\n```\n\nThe `backing` field is a generic slice of `[]uint8`, `[]uint16`,\n`[]uint32`, or `[]uint64`.\n\nThe positive and negative backing arrays are independent, so the\nmaximum space used for `buckets` by one `Aggregator` is twice the\nconfigured maximum size.\n\n### Backing array\n\nThe backing array is circular.  The first observation is counted in\nthe 0th index of the backing array and the initial bucket number is\nstored in `indexBase`.  After the initial observation, the backing\narray grows in either direction (i.e., larger or smaller bucket\nnumbers), until rescaling is necessary.  This mechanism allows the\nhistogram to maintain the ideal scale without shifting values inside\nthe array.\n\nThe `indexStart` and `indexEnd` fields store the current minimum and\nmaximum bucket numbers.  The initial condition is `indexBase ==\nindexStart == indexEnd`, representing a single bucket.\n\nFollowing the first observation, new observations may fall into a\nbucket up to `size-1` in either direction.  
Growth is possible by\nadjusting either `indexEnd` or `indexStart` as long as the constraint\n`indexEnd-indexStart \u003c size` remains true.\n\nBucket numbers in the range `[indexBase, indexEnd]` are stored in the\ninterval `[0, indexEnd-indexBase]` of the backing array.  Buckets in\nthe range `[indexStart, indexBase-1]` are stored in the interval\n`[size+indexStart-indexBase, size-1]` of the backing array.\n\nConsidering the `aggregation.Buckets` interface, `Offset()` returns\n`indexStart`, `Len()` returns `indexEnd-indexStart+1`, and `At()`\nlocates the correct bucket in the circular array.\n\n### Determining change of scale\n\nThe algorithm used to determine the (best) change of scale when a new\nvalue arrives is:\n\n```golang\n// newScale returns the scale after applying the required reduction.\nfunc newScale(minIndex, maxIndex, scale, maxSize int32) int32 {\n\treturn scale - changeScale(minIndex, maxIndex, scale, maxSize)\n}\n\n// changeScale counts how many times the index range must be halved\n// before it fits within maxSize buckets.\nfunc changeScale(minIndex, maxIndex, scale, maxSize int32) int32 {\n\tvar change int32\n\tfor maxIndex-minIndex \u003e= maxSize {\n\t\tmaxIndex \u003e\u003e= 1\n\t\tminIndex \u003e\u003e= 1\n\t\tchange++\n\t}\n\treturn change\n}\n```\n\nThe `changeScale` function is also used to determine how many bits to\nshift during `Merge`.\n\n### Downscale function\n\nThe downscale function rotates the circular backing array so that\n`indexStart == indexBase`, using the \"3 reversals\" method, before\ncombining the buckets in place.\n\n### Merge function\n\n`Merge` first calculates the correct final scale by comparing the\ncombined positive and negative ranges.  
The destination aggregator is\nthen downscaled, if necessary, and the `UpdateByIncr` code path is used to add\nthe source buckets to the destination buckets.\n\n### Scale function\n\nThe `Scale` function returns the current scale of the histogram.\n\nIf the scale is variable and there are no non-zero values in the\nhistogram, the scale is zero by definition; when there is only a\nsingle value in this case, its scale is MaxScale (20) by definition.\n\nIf the scale is fixed because of range limits, the fixed scale will be\nreturned for a histogram of any size.\n\n### Handling subnormal values\n\nSubnormal values are those in the range [0x1p-1074, 0x1p-1022), these\nbeing numbers that \"gradually underflow\" and use fewer than 52 bits of\nprecision in the significand at the smallest representable exponent\n(i.e., -1022).  Subnormal numbers present special challenges for both\nthe exponent- and logarithm-based mapping functions, and to avoid\nadditional complexity induced by corner cases, subnormal numbers are\nrounded up to 0x1p-1022 in this implementation.\n\nHandling subnormal numbers is difficult for the logarithm mapping\nfunction because Golang's `math.Log()` function rounds subnormal\nnumbers up to 0x1p-1022.  Handling subnormal numbers is difficult for\nthe exponent mapping function because Golang's `math.Frexp()`, the\nnatural API for extracting a value's base-2 exponent, also rounds\nsubnormal numbers up to 0x1p-1022.\n\nWhile the additional complexity needed to correctly map subnormal\nnumbers is small in both cases, there are few real benefits in doing\nso because of the inherent loss of precision.  As secondary\nmotivation, clamping values to the range [0x1p-1022, math.MaxFloat64]\nincreases symmetry. This limit means that the minimum bucket index and the\nmaximum bucket index have similar magnitude, which helps support\na greater maximum scale.  
Supporting numbers smaller than 0x1p-1022\nwould mean changing the valid scale interval to [-11,19] compared with\n[-10,20].\n\n### UpdateByIncr interface\n\nThe OpenTelemetry metrics SDK `Aggregator` type supports an `Update()`\ninterface which implies updating the histogram by a count of 1.  This\nimplementation also supports `UpdateByIncr()`, which makes it possible\nto support counting multiple observations in a single API call.  This\nextension is useful in applying `Histogram` aggregation to _sampled_\nmetric events (e.g. in the [OpenTelemetry statsd\nreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/statsdreceiver)).\n\nAnother use for `UpdateByIncr` is in a Span-to-metrics pipeline\nfollowing [probability sampling in OpenTelemetry tracing\n(WIP)](https://github.com/open-telemetry/opentelemetry-specification/pull/2047).\n\n## Acknowledgements\n\nThis implementation is based on work by [Yuke\nZhuge](https://github.com/yzhuge) and [Otmar\nErtl](https://github.com/oertl).  See\n[NrSketch](https://github.com/newrelic-experimental/newrelic-sketch-java/blob/1ce245713603d61ba3a4510f6df930a5479cd3f6/src/main/java/com/newrelic/nrsketch/indexer/LogIndexer.java)\nand\n[DynaHist](https://github.com/dynatrace-oss/dynahist/blob/9a6003fd0f661a9ef9dfcced0b428a01e303805e/src/main/java/com/dynatrace/dynahist/layout/OpenTelemetryExponentialBucketsLayout.java)\nrepositories for more detail.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flightstep%2Fgo-expohisto","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flightstep%2Fgo-expohisto","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flightstep%2Fgo-expohisto/lists"}