{"id":21332700,"url":"https://github.com/jmacd/go-sampler","last_synced_at":"2025-03-16T01:13:59.103Z","repository":{"id":263066887,"uuid":"888619471","full_name":"jmacd/go-sampler","owner":"jmacd","description":"Prototype for OTel rule-based sampler","archived":false,"fork":false,"pushed_at":"2025-02-25T00:38:52.000Z","size":53,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-25T01:26:30.061Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jmacd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-14T17:56:46.000Z","updated_at":"2025-02-25T00:38:53.000Z","dependencies_parsed_at":"2025-02-13T02:31:54.980Z","dependency_job_id":null,"html_url":"https://github.com/jmacd/go-sampler","commit_stats":null,"previous_names":["jmacd/go-sampler"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmacd%2Fgo-sampler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmacd%2Fgo-sampler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmacd%2Fgo-sampler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmacd%2Fgo-sampler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jmacd","download_url":"https://codeload.github.com/jmacd/go-sampler/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"
https://github.com","kind":"github","repositories_count":243809883,"owners_count":20351407,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-21T22:53:01.806Z","updated_at":"2025-03-16T01:13:59.097Z","avatar_url":"https://github.com/jmacd.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OTel-Go Composable Sampler prototype\n\n## Summary\n\nThis is a prototype in support of [OTEP 250](https://github.com/open-telemetry/oteps/pull/250).\n\nThis is at a \"proof-of-concept\" maturity level.  Testing coverage is\ngood but intentionally not complete.\n\nAs described in the OTEP, the use of consistent probability sampling\nin complex scenarios can lead to inaccurate counting.  The machinery\nneeded to enforce consistent sampling so that it can be widely\ndeployed with reliable results is necessarily more complex.\n\nThis prototype demonstrates a solution tailored to the OTel-Go Trace\nSDK.  Pieces of this implementation are copied verbatim from that SDK\nto make the prototype useful in that context.  
In particular, the\n`Sampler`, `SamplingParameters`, `SamplingResult`, and\n`SamplingDecision` types are preserved and the `AlwaysSample()` and\n`ParentBased()` constructors are copied exactly, so the benchmarks can\nbe faithful.\n\nThis prototype was initiated for several reasons, one of which was\n[questions in the previous prototype PR about\nperformance](https://github.com/open-telemetry/opentelemetry-go/pull/5645).\n\nThis implementation follows \"Approach 2\" described in the OTEP, which\nis designed to optimize the cost and structure of consistent sampling\ndecision-making.  Another prototype for the same in Java can be found\n[here](https://github.com/open-telemetry/opentelemetry-java-contrib/tree/main/consistent-sampling/src/main/java/io/opentelemetry/contrib/sampler/consistent56).\n\n### Key Contributions\n\nThe goal of this prototype is to establish a backwards-compatible API\nfor Samplers, particularly for the existing `ParentBased` Sampler.\nThis was not directly addressed in the OTEP.  The reason this is\nsomewhat challenging is that the `ParentBased` specification accepts\n`Sampler` instances to be called for unsampled contexts, and a\nconsistent (and unbiased) sampler cannot condition on the sampled\nflag.\n\nThis prototype shows how `ParentBased` can be replaced by various\ncombinations of `Composite`, `Annotating`, `RuleBased`, and\n`ParentThreshold` samplers.\n\n#### ComposableSampler\n\nThis is a new `ComposableSampler` interface as described in the OTEP.\nThis sampler returns an intent as opposed to a decision.  Sampling\nintentions can be combined, for example in an `AnyOf` sampler.\n\n#### CompositeSampler\n\nCompositeSampler constructs a `Sampler` from a `ComposableSampler`.\nThis is where the bulk of the logic involved in consistent sampling\nhappens.\n\nOne key optimization that can be found here: the `ComposableSampler`\ndoes not have the ability to modify `TraceState` the way a `Sampler`\ncan.  This enables two optimizations:\n\n1. 
Information about the parsed input `TraceState` can be used to\n   construct a modified output `TraceState`.  There are situations\n   where the `CompositeSampler` has to erase or modify the encoded\n   threshold, and this can re-use the position information from the\n   parser if the `ComposableSampler` API prevents modifying TraceState.\n2. The `AnnotatingSampler` interface can _potentially_ make use of the\n   randomness value assuming it does not change as the result of a\n   `ComposableSampler`.  The example where this matters: a sampler,\n   part of an AnyOf construction, wishes to insert an attribute\n   specifically when it would sample, not necessarily when there is a\n   global decision to sample.  For a sampler to ask \"would I sample?\"\n   the randomness value needs to be accessible in the parameters and\n   cannot be mutated.\n\nThe upshot of this is that for a Root sampler to set the randomness\nvalue, it will have to be done outside of the composite sampler interface.\n\n#### ComposableSamplingParameters\n\nThe `ComposableSamplingParameters` type combines the original\nSamplingParameters with an optimization.  The original\nSamplingParameters include the parent's `context.Context`, which\nforces Samplers to look up the SpanContext.  In a composite sampler,\nthis could happen multiple times, so it makes sense to include\n`SpanContext` in the parameters directly.\n\nThis type includes non-exported copies of the effective incoming\nthreshold and the computed randomness value, with the following\nrationale:\n\n1. The incoming threshold value is the one that `ParentThreshold()`\n   sampler will use.  It is the only sampler that uses this field, and\n   its value is derived using inputs from the `ConsistentSampler`.\n2. 
The incoming randomness value can be used for a sampler to ask\n   whether it itself would sample; see the `WouldSample(params\n   ComposableSamplingParameters) bool` API.\n\n#### TraceIdRatioBased\n\nThis aspect of the prototype is copied from the [first\nprototype](https://github.com/open-telemetry/opentelemetry-go/pull/5645).\nSee the related, pending OpenTelemetry specification work in [PR\n4166](https://github.com/open-telemetry/opentelemetry-specification/pull/4166)\nand [PR\n4162](https://github.com/open-telemetry/opentelemetry-specification/pull/4162).\n\n#### RuleBased\n\nThis is as described in the OTEP.  This differs slightly from the\nimplementation found in the Java prototype in removing the `SpanKind`\nargument from the rule because it can be treated as an ordinary aspect of the predicate.\n\nSee [potential optimizations discussed below](#potential-optimizations).\n\n#### ParentThreshold\n\nThis is a special built-in Sampler that makes the same decision the\nparent context did, which results in passing through consistent\nsampling thresholds correctly.  As an example, the essential function\nof `ParentBased` can be replaced as follows:\n\n```\nfunc ComposableParentBased(root ComposableSampler) ComposableSampler {\n\treturn RuleBased(\n\t\tWithRule(IsRootPredicate(), root),\n\t\tWithDefaultRule(ParentThreshold()),\n\t)\n}\n```\n\n#### AnnotatingSampler\n\nThis is a convenience implementation of `ComposableSampler` meant to\nsupport adding span attributes while using `ComposableSampler` APIs.\n\n## Benchmarks\n\nThe benchmarks here are sufficient to identify the overhead introduced\nby composable samplers and enforcing consistent sampling.  The\noriginal OTel-Go AlwaysOn and ParentBased samplers are included as a\nreference.\n\nThese benchmarks are not comprehensive, but they illustrate what can be\nachieved with this approach.  
Note that this code has a fast-path\noptimization for the `AlwaysSample()` case, which is the case where an\nempty TraceContext is modified to include `ot=th:0`.  All of the\nstandard parent-based sampling configurations, therefore, do not\nmodify TraceState and have zero memory allocations.\n\n```\ngoos: darwin\ngoarch: arm64\npkg: github.com/jmacd/sampler\nBenchmarkAlwaysOn-10                                                       \t67102522\t        19.96 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkConsistentAlwaysOn-10                                             \t26184316\t        45.24 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkComposableParentBasedUnknownThreshold-10                          \t14998062\t        72.46 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkComposableParentBasedParentThreshold-10                           \t13936856\t        85.01 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkComposableParentBasedNonEmptyTraceStateUnknownThreshold-10        \t16190498\t        75.56 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkComposableParentBasedNonEmptyOTelTraceStateUnknownThreshold-10    \t12790028\t        92.12 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkComposableParentBasedNonEmptyOTelTraceStateParentThreshold-10     \t 8466374\t       138.9 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkParentBasedNoTraceState-10                                        \t20660518\t        56.45 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkParentBasedWithOTelTraceStateIncludingRandomness-10               \t21257059\t        65.72 ns/op\t       0 B/op\t       0 allocs/op\n```\n\n## Areas not explored\n\n### Potential optimizations\n\nThe difference between the Java and Go prototypes comes down to\nhandling SpanKind as a special parameter, or not.  
There are standing\nrequests in OpenTelemetry to incorporate the Resource and Scope values\ninto the sampling decision.\n\nThere are potentially a number of directions to explore here,\nincluding ways to compile aggregate sampler behavior with\noptimizations.  For example, a RuleBased sampler can be re-organized\nto have one set of rules per span kind.  Similarly, composable sampler\npredicates could be pre-evaluated in terms of resource and scope\nattributes that are available, in order to lower the cost of sampler\nevaluation.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmacd%2Fgo-sampler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjmacd%2Fgo-sampler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmacd%2Fgo-sampler/lists"}