https://github.com/rexagod/cpv
A utility to help teams make the most out of CMO's Collection Profiles featureset.
https://github.com/rexagod/cpv
metrics openshift
Last synced: 7 months ago
JSON representation
A utility to help teams make the most out of CMO's Collection Profiles featureset.
- Host: GitHub
- URL: https://github.com/rexagod/cpv
- Owner: rexagod
- License: gpl-3.0
- Created: 2023-07-18T17:12:37.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-11-20T09:47:57.000Z (over 2 years ago)
- Last Synced: 2024-12-27T10:09:24.991Z (over 1 year ago)
- Topics: metrics, openshift
- Language: Go
- Homepage: https://github.com/openshift/enhancements/pull/1298
- Size: 4.01 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# `cpv`: Collection Profiles Validator
[](https://github.com/rexagod/cpv/actions) [](https://goreportcard.com/report/github.com/rexagod/cpv) [](https://pkg.go.dev/github.com/rexagod/cpv)
`cpv` is a command line tool for working with [collection profiles](https://github.com/openshift/enhancements/blob/719b231e3b06cf274e77f0d89e46a0d258002572/enhancements/monitoring/scrape-profiles.md?plain=1).
## Usage
`cpv` expects the following set of flags.
```
Usage of ./cpv:
-address string
Address of the Prometheus instance. (default "http://localhost:9090")
-allow-list-file string
Path to a file containing a list of allow-listed metrics that will always be included within the extracted metrics set. Requires -profile flag to be set.
-bearer-token string
Bearer token for authentication.
-kubeconfig string
Path to kubeconfig file. Defaults to $KUBECONFIG.
-noisy
Enable noisy assumptions: interpret the absence of the collection profiles label as the default 'full' profile (when using the -status flag).
-output-cardinality
Output cardinality of all extracted metrics to a file.
-profile string
Collection profile that the command is being run for.
-quiet
Suppress all output, and use $EDITOR for generated manifests.
-rule-file string
Path to a valid rule file to extract metrics from, for eg., https://github.com/prometheus/prometheus/blob/v0.45.0/model/rulefmt/testdata/test.yaml. Requires -profile flag to be set.
-status
Report collection profiles' implementation status. -profile may be empty to report status for all profiles.
-target-selectors string
Target selectors used to extract metrics, for eg., https://github.com/prometheus/client_golang/blob/644c80d1360fb1409a3fe8dfc5bad4228f282f3b/api/prometheus/v1/api_test.go#L1007. Requires -profile flag to be set.
-validate
Validate the collection profile implementation. Requires -profile flag to be set.
-version
Print version information.
```
### Scenarios
While the utility can be used with the various aforementioned flag combinations to fulfill the desired use-case, the following ones may comparatively be more prominent within the general workflow and thus, have been documented in order to get the developers up-and-running with in no time.
#### Extraction
The utility can be used to extract metrics based a set of given parameters that include:
* `-allow-list-file`: Path to a file containing a list of metrics that will always be included within the extracted metrics set, even if they are not present in the Prometheus instance forwarded at `-address`.
* `-rule-file`: Path to a file containing a set of [`RuleGroup`](https://github.com/prometheus/client_golang/blob/v1.17.0/api/prometheus/v1/api.go#L569)s. All metrics used to define `expr`essions within the `rules` will be extracted. For example, [`model/rulefmt/testdata/test.yaml`](https://github.com/prometheus/prometheus/blob/v0.45.0/model/rulefmt/testdata/test.yaml) will result in the extraction of two metrics: `errors_total` and `requests_total`.
* `-target-selectors`: A set of constraints (resembling [`VectorSelector`](https://github.com/prometheus/prometheus/blob/32ee1b15de6220ab975f3dac7eb82131a0b1e95f/promql/parser/ast.go#L126)s) satisfying the `matchTarget` parameter in [`TargetsMetadata`](https://github.com/prometheus/client_golang/blob/0356577e9b46283f8efae268b73ffee773a6feb7/api/prometheus/v1/api.go#L501). For example. `"{job=\"prometheus\", severity=\"critical\"}"` will result in the extraction of all metrics present in the Prometheus instance forwarded at `-address`, that have the `job` label set to `prometheus` and the `severity` label set to `critical`.
All these flags are mutually exclusive and require the `-profile` flag to be set. Once extracted, the metrics are used to generate a [`RelabelConfig`](https://github.com/prometheus-operator/prometheus-operator/blob/pkg/apis/monitoring/v0.66.0/pkg/apis/monitoring/v1/prometheus_types.go#L1267) that [can be dropped into the `ServiceMonitor` or `PodMonitor` resource](https://github.com/openshift/cluster-monitoring-operator/pull/1785/files#diff-2ced247f66ba1c3c56d30d7ae8c78af6a5eb5e561060d5d64f5caa4cd42626b9R15).
```bash
$ ./cpv -profile="$PROFILE" -rule-file="$RULE_FILE" -target-selectors="$TARGET_SELECTORS" -allow-list-file="$ALLOW_LIST_FILE"
```
```yaml
sourcelabels:
- __name__
separator: ""
targetlabel: ""
regex: (foo|bar|...)
modulus: 0
replacement: ""
action: keep
```
Additionally, `-output-cardinality` may be specified to output the cardinality of all extracted metrics to a file, in order to better assess decisions around keeping or dropping certain metrics within the `ServiceMonitor` or `PodMonitor` resource(s) for a particular profile.
```
METRIC CARDINALITY
foo 40
bar 10
...
```
#### Status
The utility can be used to evaluate the extent to which a collection profile has been implemented for every default `ServiceMonitor` or `PodMonitor` resource that has [opted-in to Collection Profiles feature](https://github.com/rexagod/cpv/blob/74ff86c9a7f99635b40f991efc6eb14c859bb496/internal/profiles/utils.go#L48). For example, with respect to the [`default` Kube State Metrics `ServiceMonitor`](https://github.com/JoaoBraveCoding/cluster-monitoring-operator/blob/ad0a06d61793336a7d520cb37d48a053b1b233d1/assets/kube-state-metrics/service-monitor.yaml#L9) (notice the explicit opt-in label), the utility, seeing that this has opted-in to the Collection Profiles feature, will check for the presence of all corresponding [`SupportedNonDefaultCollectionProfiles`](https://github.com/rexagod/cpv/blob/373d577560bae10f10769aeeab33781df7d4dc8f/internal/profiles/types.go#L24) for that `ServiceMonitor` and report the status for each of them (whether they exist or not).
For **all** profiles to be "fully implemented" (i.e., when `-status` is used without specifying a particular `-profile=$PROFILE`) all of the default opted-in `ServiceMonitor` or `PodMonitor` resources (i.e., with `monitoring.openshift.io/collection-profile` label set to `full`) must have the same corresponding resources for every such profile. Here, "corresponding resources" mean the `ServiceMonitor` or `PodMonitor` resources that have their `metadata.name` same as their default opted-in `ServiceMonitor` or `PodMonitor` resource counterpart appended by the profile they fulfill, and with the `monitoring.openshift.io/collection-profile` label set to the profile being checked for.
So, for example, for an opted-in default `ServiceMonitor` resource with `metadata.name` as `kube-state-metrics` and `monitoring.openshift.io/collection-profile: full` present within its label set, the corresponding `ServiceMonitor` resources for the, say, `minimal` profile would be `kube-state-metrics-minimal`. The utility will check for the presence of all corresponding resources for every profile with the default resources' `metadata.name` as the base and report the status for each of them.
```bash
$ ./cpv -profile="$PROFILE" -status
```
```
PROFILE SERVICE MONITOR POD MONITOR ERROR
$PROFILE foo-monitor not implemented
$PROFILE bar-monitor not implemented
...
```
Additionally, a `-noisy` flag may be specified to interpret the absence of `monitoring.openshift.io/collection-profile: full` within the default `ServiceMonitor` or `PodMonitor` resources as the default `full` profile. This is useful when the `ServiceMonitor` or `PodMonitor` resources have not been updated to opt-in to the Collection Profiles feature yet.
#### Validation
The utility can be used to validate against any discrepancies that impact the specified `ServiceMonitor` or `Podmonitor` resources. For this purpose, the utility expects the `-profile` flag, i.e, the profile that the validation should run against, and the `-validate` flag to be set. The validation works by reporting the hierarchy of any missing metrics that the specified `-profile` depends on, the absence of which in turn may end up impacting the resources dependent on those metrics.
```bash
$ ./cpv -profile="$PROFILE" -validate
```
```
$PROFILE MONITOR GROUP LOCATION RULE QUERY METRIC ERROR
etcd-minimal etcd .../openshift-etcd-operator-etcd-prometheus-rules-....yaml etcdMemberCommunicationSlow histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket{job=~".*etcd.*"}[5m])) > 0.15 etcd_network_peer_round_trip_time_seconds_bucket not loaded
...
```
## License
[GNU GPLv3](LICENSE)