Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/hewlettpackard/prometheus-parser-rs

a Rust library for parsing and validating Prometheus query expressions
https://github.com/hewlettpackard/prometheus-parser-rs

linter parser prometheus rust

Last synced: 3 months ago
JSON representation

a Rust library for parsing and validating Prometheus query expressions

Awesome Lists containing this project

README

        

# prometheus-parser-rs

[![docs.rs](https://docs.rs/prometheus-parser/badge.svg)](https://docs.rs/prometheus-parser/)

prometheus-parser parses arbitrary Prometheus expressions into a syntax tree
appropriate for syntax checking, linting, and general static analysis. It does
not depend on a Prometheus server and works entirely offline.

This library is implemented in pure Rust (using [pest]) with no knowledge of
the underlying Prometheus server implementation. It's been validated against
the public Prometheus syntax documentation and various publicly available alert
collections (e.g. [kubernetes-mixin]).

Note that this crate doesn't try to _evaluate_ Prometheus expressions; consider
using [`promtool`'s unit testing][unit] for this purpose.

[pest]: https://pest.rs/
[unit]: https://prometheus.io/docs/prometheus/latest/configuration/unit_testing_rules/
[kubernetes-mixin]: https://github.com/kubernetes-monitoring/kubernetes-mixin

## Use cases

This library can be used to implement custom linting and static analysis tools,
for example:

* Offline syntax checking with useful error messages (thanks to [pest])
* Automatic code formatting
* Various AST inspections:
* Ensuring expressions don't drop required labels for e.g. alert routing
(namespace, service, etc)
* Ensuring expressions don't make common mistakes that can cause spurious
alerts, e.g. division by zero with counters

## Limitations

* It doesn't try to evaluate expressions at all
* It doesn't validate function calls or argument types, so `foo(bar, baz, qux)`
is considered at least _syntactically_ valid. Each function argument still
needs to parse to a real expression, though.
* The `return_value()` utility will raise soft errors on certain type errors
(via `ReturnKind::Unknown`)
* Note that Prometheus performs some extra query validation based on runtime
data and will return errors on e.g. label mismatches that we can't catch.
* It properly parses most obscure elements of Prometheus syntax, however it's
not entirely lossless; equivalent expressions (e.g. `sum by (a) (foo{})` and
`sum (foo) by (a)`) will be evaluated identically, so expect minor output
differences if you use this to implement a formatter.
* It probably doesn't parse entirely the same as the real Prometheus parser,
though care was taken to match it as closely as possible. That is, this
parser may consider expressions valid that Prometheus won't, or vice versa.
Please consider filing bugs for any differences.

## Return value prediction

The library has a utility for predicting the return types of expressions based
only on the syntax tree. This may be useful for rough static analysis.

* `Expression::return_value()` recursively computes an expected return type
(scalar, instant vector, range vector, etc) along with a set of label
operations. If for some reason this fails, `ReturnKind::Unknown` is returned
with a message explaining why.
* `ReturnValue::passthrough(input_labels: &[&str])` can then be used predict
which output labels should be present for a given query. It accepts a list of
initial label names assumed to be present on the metrics (e.g. expected
metrics like `namespace`) and returns a set of expected output labels.

Note that additional labels may be returned: if expressions explicitly select
for labels, they'll be returned as well; an empty list of initial labels can be
used to see what extra labels an expression will query for.

* `ReturnValue::drops(label: &str)` determines which AST node, if any, dropped
the given input label.

## Examples

### [`examples/parse.rs`](./examples/parse.rs)

This can be used to dump the (slightly reduced) syntax tree for a given
expression.

Try `cargo run --example parse 'foo'`:

```bash
$ cargo run --example parse 'count_values
by(service) ("config_hash", alertmanager_config_hash{job="alertmanager-main",namespace="monitoring"})
/ on(service) group_left() label_replace(max by(name, job, namespace, controller)
(prometheus_operator_spec_replicas{controller="alertmanager",job="prometheus-operator",namespace="monitoring"}),
"service", "alertmanager-$1", "name", "(.*)") !=
1'

Operator {
kind: NotEqual,
lhs: Operator {
kind: Divide,
lhs: Function {
name: "count_values",
args: [
# ... snip ...
],
aggregation: Some(
Aggregation {
op: By,
labels: [
"service",
],
},
),
},
rhs: Function {
name: "label_replace",
args: [
# ... snip ...
],
aggregation: None,
},
matching: Some(
Matching {
op: On,
labels: [
"service",
],
group: Some(
MatchingGroup {
op: Left,
labels: [],
},
),
},
),
},
rhs: 1.0,
matching: None,
}
```

Note that `Expression` has a `Debug` impl that somewhat reduces boilerplate
output. It makes debug output more readable, but the displayed object structure
won't exactly match the in-memory version. Specifically, if constructing your
own AST objects, use the `.wrap()` utility functions present on most
`Expression` subtypes to convert them to an `Expression`.

### [`examples/reformat.rs`](./examples/reformat.rs)

Parses the input expression and reformats it using the default `Display` impl.

```bash
$ cargo run --example reformat 'sum by(bar)(foo)'
sum(foo) by (bar)
```

### [`examples/return_value.rs`](./examples/return_value.rs)

Displays the raw `ReturnValue` computed for a given expression.

```bash
$ cargo run --example return_value 'foo(sum(foo))'
ReturnValue {
kind: InstantVector,
label_ops: [
LabelSetOpTuple {
op: Clear,
expression: Function {
name: "sum",
# ... snip ...
},
span: Some(
Span {
start: 4,
end: 12,
},
),
},
],
}

```

### [`examples/label_drop.rs`](./examples/label_drop.rs)

Shows where a label is dropped in an expression, if at all.

```bash
$ cargo run --example label_drop 'foo(sum(foo))' namespace
label 'namespace' is dropped:

foo(sum(foo))
--------

parent expression: sum(foo)
```

If an expression is invalid and has an unknown return type, a message will be
printed:

```bash
$ cargo run --example label_drop 'foo(sum(foo)) / bar[5m:]' namespace
note: expression return type is unknown, result may be inaccurate
reason: rhs return type (RangeVector) is not valid in an operator
expression: foo(sum(foo)) / bar[5m:]

label 'namespace' is not dropped
```

### [`examples/label_passthrough.rs`](./examples/label_passthrough.rs)

Shows the expected output labels for a given expression. Arguments after the
expression are labels that exist on all input metrics.

Additional labels may be returned that were not present in the input list; these
were inferred based on labels referenced in the expression.

```bash
$ cargo run --example label_passthrough 'sum by(bar)(foo)' baz
{
"bar",
}
```

## Contributing

Bug reports, feature requests, and pull requests are welcome! Be sure to read
though the [code of conduct] for some pointers to get started.

Note that - as mentioned in the code of conduct - code contributions must
indicate that you accept the [Developer Certificate of Origin][dco],
essentially asserting you have the necessary rights to submit the code you're
contributing under the project's license (MIT). If you agree, simply pass `-s`
to `git commit`:

```bash
git commit -s [...]
```

... and Git will automatically append the required `Signed-off-by: ...` to the
end of your commit message.

[code of conduct]: ./CODE_OF_CONDUCT.md
[dco]: https://developercertificate.org/