https://github.com/outerbounds/metaflow-measure
Measure metrics in Metaflow and send them to Datadog and other backends
- Host: GitHub
- URL: https://github.com/outerbounds/metaflow-measure
- Owner: outerbounds
- License: apache-2.0
- Created: 2024-05-13T04:48:16.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-05-21T22:35:02.000Z (6 months ago)
- Last Synced: 2024-05-21T23:35:17.694Z (6 months ago)
- Language: Python
- Size: 43 KB
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Metrics and Measurements in Metaflow steps
This extension introduces a `measure` API that allows you to send custom
metrics and measure execution times in your Metaflow steps.

## Key features

- Very simple instrumentation API: Measure your code with a few
  lines of Python.
- Separation between the instrumentation API and the metrics
  backends: Instrument your code with the `measure` API,
  record metrics locally during development, and enable a
  production backend like Datadog at deployment time.
- Works locally and on `@kubernetes` and `@batch` with no
  changes in the code.
- Native integration with Metaflow: Metrics are tagged
  with the Metaflow run ID, step names, `@project` branches,
  etc., so you can drill into details.
- Works at scale: Uses aggregators like `dogstatsd` to
  avoid overloading backend APIs.
- No extra dependencies: `@datadog` installs
  `dogstatsd` on the fly, so it works in any execution
  environment.

## Installation
In your development environment, install the package:
```
pip install metaflow-measure
```
Note that you don't need to make the package available in
container images you use to execute tasks remotely. Metaflow
packages the extension automatically for remote execution.

## API
The `measure` module exposes [`statsd`-style measurement
functions](https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/):
`gauge`, `increment`, and `decrement`.

Optionally, all `measure` functions take a keyword argument
`tags` which takes a list of custom tags (strings) to be
associated with the measurement.

### Basic Metrics
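As background, `statsd`-style clients conventionally describe each measurement as a short text datagram holding the metric name, value, type, and optional tags. The sketch below formats such lines following the DogStatsD datagram layout. `format_datagram` is a hypothetical helper written for illustration; it is not part of `measure`, and this README does not describe how the extension itself communicates with `dogstatsd`.

```python
def format_datagram(name, value, metric_type, tags=None):
    """Build a DogStatsD-style datagram: <name>:<value>|<type>|#<tag>,<tag>."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


# Gauges use type "g"; counters use type "c".
gauge_line = format_datagram("mymetric", 42, "g")
# An increment is a counter with a positive value ...
incr_line = format_datagram("mymetric", 1, "c", tags=["team:ml", "experimental"])
# ... and a decrement is a counter with a negative value.
decr_line = format_datagram("mymetric", -1, "c")

print(gauge_line)
print(incr_line)
print(decr_line)
```

The tag strings here are made up. The extension's own calls look like this: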
```python
from metaflow.plugins import measure

# record a gauge metric
measure.gauge('mymetric', value)

# increment a metric
measure.increment('mymetric', value)

# decrement a metric
measure.decrement('mymetric', value)
```

### Distributions
In addition, `measure` provides `distribution`, which allows
you to measure distributions of values (e.g. p50, p95, etc.), relying
on server-side aggregation for accuracy.

```python
from metaflow.plugins import measure

# record a distribution metric
measure.distribution('mydistribution', value)
```
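To illustrate why percentiles benefit from server-side aggregation, the following sketch compares averaging per-task p95 values against computing p95 over the pooled raw values. It uses only the standard library and made-up latencies; it does not call `measure` and does not claim to reproduce how Datadog aggregates distributions.

```python
import statistics


def p95(values):
    """95th percentile, using the standard library's inclusive method."""
    return statistics.quantiles(values, n=100, method="inclusive")[94]


# Made-up latencies (in ms) observed by two separate tasks.
task_a = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
task_b = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

# Aggregating on the client side: each task reports its own p95,
# and the reported values are averaged afterwards.
avg_of_p95 = (p95(task_a) + p95(task_b)) / 2

# Aggregating on the server side: raw values from all tasks are
# pooled first, and the percentile is computed once.
global_p95 = p95(task_a + task_b)

print(f"average of per-task p95: {avg_of_p95:.2f}")
print(f"p95 over pooled values:  {global_p95:.2f}")
```

With this data, the averaged figure lands well below the percentile of the pooled values, which is the kind of distortion that sending raw values for central aggregation avoids.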
For convenience, the API provides a context manager that
allows you to measure the execution time of a code block easily.

```python
from metaflow.plugins import measure

with measure.TimeDistribution('mytiming'):
    some_time_consuming_function()
```

# Supported Backends
Currently, the following backends are supported:
## `@datadog`
Add `@datadog` in your steps to send measurements to Datadog.
Typically, you would instrument your code with `measure` and
then enable Datadog on the fly with

```
python measureflow.py run --with datadog:api_key=$API_KEY
```

### Authentication
You can provide the `api_key` in the decorator
```
@datadog(api_key=MY_KEY)
```
or on the command line
```
--with datadog:api_key=$API_KEY
```
or set the environment variable `DD_API_KEY` via `@secrets` or
`@environment`.

### Tags
The `@datadog` decorator adds various Metaflow-related tags
to all metrics, prefixed with `metaflow_`. You can disable
this with
```
@datadog(include_metaflow_tags=False)
```
and/or set custom tags to be associated with all measurements as
```
@datadog(tags=['mytag'])
```

### Debugging
To debug connectivity issues in Datadog, set
```
@datadog(verbose=True, debug_daemon=True)
```

## Example
Run the following flow to see the extension in action.
```python
import os, time

from metaflow import FlowSpec, step, datadog
from metaflow.plugins import measure

class MeasureFlow(FlowSpec):

    @datadog
    @step
    def start(self):
        for i in range(10):
            measure.increment('mftest.test_metric')
            time.sleep(1)
        with measure.TimeDistribution('mftest.slow_operation', tags=['custom_tag']):
            time.sleep(10)
        self.next(self.end)

    @step
    def end(self):
        # this metric is not sent anywhere by default,
        # unless you add @datadog or another backend
        measure.gauge('mftest.my_gauge', 42)

if __name__ == '__main__':
    MeasureFlow()
```

If you run this flow as
```
export DD_API_KEY=my_datadog_key
python measureflow.py run
```
only the measurements from the `start` step will be sent to Datadog,
thanks to the `@datadog` decorator. The `end` step still executes its `measure`
calls, but they are not sent anywhere because no backend has been configured
for the step.

To test sending all metrics to Datadog, add `@datadog` to the `end` step
or run the flow as
```
python measureflow.py run --with datadog
```

Test the code in the cloud:
```
python measureflow.py run --with kubernetes --with datadog:api_key=$DD_API_KEY
```
(or `--with batch`)
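
The per-step behavior in this example, where `measure` calls in `end` run but go nowhere unless a backend is attached, can be pictured as a thin facade over a swappable backend. The sketch below is a simplified stand-in that uses only the standard library. `Measure`, `NullBackend`, and `ListBackend` are hypothetical names, the default increment of 1 is an assumption, and none of this is taken from the extension's implementation.

```python
class NullBackend:
    """Accepts every measurement and discards it."""

    def emit(self, kind, name, value, tags):
        pass


class ListBackend:
    """Keeps measurements in memory, e.g. for local inspection."""

    def __init__(self):
        self.records = []

    def emit(self, kind, name, value, tags):
        self.records.append((kind, name, value, tuple(tags)))


class Measure:
    """Instrumentation facade: callers never talk to a backend directly."""

    def __init__(self):
        self.backend = NullBackend()  # nothing is sent by default

    def gauge(self, name, value, tags=None):
        self.backend.emit("gauge", name, value, tags or [])

    def increment(self, name, value=1, tags=None):
        self.backend.emit("count", name, value, tags or [])

    def decrement(self, name, value=1, tags=None):
        self.backend.emit("count", name, -value, tags or [])


measure = Measure()

# No backend attached: the call succeeds but the value is dropped.
measure.gauge("mftest.my_gauge", 42)

# Attaching a backend changes where measurements go,
# without touching the instrumented code.
recorder = ListBackend()
measure.backend = recorder
measure.increment("mftest.test_metric", tags=["custom_tag"])

print(recorder.records)
```

Because the instrumented code only depends on the facade, the same calls work whether measurements are discarded, recorded locally, or forwarded to a production backend.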