Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Prometheus remote write proxy that adds Cortex/Mimir tenant ID based on metric labels
- Host: GitHub
- URL: https://github.com/blind-oracle/cortex-tenant
- Owner: blind-oracle
- License: mpl-2.0
- Created: 2020-10-06T16:52:25.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2024-10-18T10:16:16.000Z (3 months ago)
- Last Synced: 2024-10-19T14:08:30.575Z (3 months ago)
- Topics: cortex, cortex-tenant, kubernetes, labels, metrics, mimir, prometheus, proxy, tenant, timeseries
- Language: Go
- Homepage:
- Size: 355 KB
- Stars: 109
- Watchers: 5
- Forks: 58
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-go - cortex-tenant - Prometheus remote write proxy that adds a Cortex tenant ID header based on metric labels. (Server Applications / HTTP Clients)
- zero-alloc-awesome-go - cortex-tenant - Prometheus remote write proxy that adds a Cortex tenant ID header based on metric labels. (Server Applications / HTTP Clients)
- awesome-go-extra - cortex-tenant (Server Applications / HTTP Clients)
README
# cortex-tenant
[![Go Report Card](https://goreportcard.com/badge/github.com/blind-oracle/cortex-tenant)](https://goreportcard.com/report/github.com/blind-oracle/cortex-tenant)
[![Coverage Status](https://coveralls.io/repos/github/blind-oracle/cortex-tenant/badge.svg?branch=main)](https://coveralls.io/github/blind-oracle/cortex-tenant?branch=main)
[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/cortex-tenant)](https://artifacthub.io/packages/helm/cortex-tenant/cortex-tenant)

Prometheus remote write proxy which marks timeseries with a Cortex/Mimir tenant ID based on labels.
## Architecture
![Architecture](architecture.svg)
## Overview
Cortex/Mimir tenants (separate namespaces where metrics are stored and queried) are identified by the `X-Scope-OrgID` HTTP header on both writes and queries.
Originally Prometheus could not be configured to send this header at all; since 2021 it can, but the header is the same for all scrape jobs. This makes it impossible to use a single Prometheus (or an HA pair) to write to multiple tenants.
This software solves the problem using the following logic:
- Receives a Prometheus remote write request
- Searches each timeseries for a specific label name and extracts the tenant ID from its value.
  If the label is not found, it can fall back to a configurable default ID.
  If no default is configured either, the write request is rejected with HTTP code 400.
- Optionally removes this label from the timeseries
- Groups the timeseries by tenant
- Issues a number of parallel per-tenant HTTP requests to Cortex/Mimir with the relevant tenant HTTP header (`X-Scope-OrgID` by default); a minimal sketch of this flow follows below
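The core of this flow can be pictured with a short Go sketch. This is an illustration, not the project's actual code: the `Label`/`TimeSeries` types and the `splitByTenant` helper are simplified stand-ins for the Prometheus remote-write protobuf structures, and the label name, default tenant and header are the configurable values described in the configuration section below.

```go
// Illustrative sketch of the per-tenant routing described above.
package main

import "fmt"

type Label struct{ Name, Value string }

type TimeSeries struct{ Labels []Label }

// splitByTenant groups timeseries by the value of tenantLabel, optionally
// dropping that label and falling back to defaultTenant when it is absent.
func splitByTenant(series []TimeSeries, tenantLabel, defaultTenant string, removeLabel bool) (map[string][]TimeSeries, error) {
	out := make(map[string][]TimeSeries)

	for _, ts := range series {
		tenant := defaultTenant
		kept := make([]Label, 0, len(ts.Labels))

		for _, l := range ts.Labels {
			if l.Name == tenantLabel {
				tenant = l.Value
				if removeLabel {
					continue // drop the tenant label from the forwarded series
				}
			}
			kept = append(kept, l)
		}

		if tenant == "" {
			// Mirrors the documented behaviour: no tenant label and no
			// default configured -> the write request is rejected (HTTP 400).
			return nil, fmt.Errorf("no tenant label found and no default tenant configured")
		}

		ts.Labels = kept
		out[tenant] = append(out[tenant], ts)
	}
	return out, nil
}

func main() {
	series := []TimeSeries{
		{Labels: []Label{{"__name__", "up"}, {"tenant", "foobar"}}},
		{Labels: []Label{{"__name__", "up"}, {"tenant", "deadbeef"}}},
	}

	groups, err := splitByTenant(series, "tenant", "", true)
	if err != nil {
		panic(err)
	}

	// Each group would be re-marshalled into its own remote-write request and
	// sent to Cortex/Mimir with the tenant header (X-Scope-OrgID by default).
	for tenant, ts := range groups {
		fmt.Printf("X-Scope-OrgID: %s -> %d series\n", tenant, len(ts))
	}
}
```

Running it prints one line per tenant group, which corresponds to the per-tenant fan-out the proxy performs before forwarding.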
## Usage

- Get an `rpm` or `deb` package for amd64 from the Releases page. For building from source, see below.
### HTTP Endpoints
- GET `/alive` returns 200 by default and 503 if the service is shutting down (if `timeout_shutdown` setting is > 0)
- POST `/push` receives metrics from Prometheus - configure your remote write to send requests here
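As an illustration, a load balancer or liveness probe could poll `/alive` like the short Go snippet below. The address is an assumption taken from the `listen: 0.0.0.0:8080` value in the example config further down.

```go
// Minimal liveness check against the proxy's /alive endpoint.
package main

import (
	"fmt"
	"net/http"
)

func main() {
	resp, err := http.Get("http://127.0.0.1:8080/alive")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// 200 means the proxy is up; 503 means it is draining during shutdown
	// (when timeout_shutdown > 0).
	fmt.Println("alive:", resp.StatusCode == http.StatusOK, "status:", resp.StatusCode)
}
```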
### Configuration

The service can be configured via a config file and/or environment variables. The config file can be specified with the `-config` CLI argument.
If both are used, environment variables take precedence (i.e. they override values from the config file), as sketched below.
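The precedence rule can be pictured with a small Go sketch; this is purely illustrative and not the project's actual configuration loader.

```go
// Illustrative env-over-config precedence: a set environment variable such as
// CT_LISTEN overrides the value parsed from the YAML file; an unset variable
// leaves the file value intact.
package main

import (
	"fmt"
	"os"
)

func fromEnvOrConfig(envVar, configValue string) string {
	if v, ok := os.LookupEnv(envVar); ok {
		return v
	}
	return configValue
}

func main() {
	// Pretend "listen: 0.0.0.0:8080" was read from the config file.
	listen := fromEnvOrConfig("CT_LISTEN", "0.0.0.0:8080")
	fmt.Println("effective listen address:", listen)
}
```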
See below for the config file format and the corresponding env vars.

```yaml
# Where to listen for incoming write requests from Prometheus
# env: CT_LISTEN
listen: 0.0.0.0:8080

# Profiling API, remove to disable
# env: CT_LISTEN_PPROF
listen_pprof: 0.0.0.0:7008

# Where to send the modified requests (Cortex/Mimir)
# env: CT_TARGET
target: http://127.0.0.1:9091/receive

# Whether to enable querying for IPv6 records
# env: CT_ENABLE_IPV6
enable_ipv6: false

# This parameter sets the limit for the count of outgoing concurrent connections to Cortex / Mimir.
# By default it's 64 and if all of these connections are busy you will get errors when pushing from Prometheus.
# If your `target` is a DNS name that resolves to several IPs then this will be a per-IP limit.
# env: CT_MAX_CONNS_PER_HOST
max_conns_per_host: 0

# Authentication (optional)
auth:
  # Egress HTTP basic auth -> add `Authentication` header to outgoing requests
  egress:
    # env: CT_AUTH_EGRESS_USERNAME
    username: foo
    # env: CT_AUTH_EGRESS_PASSWORD
    password: bar

# Log level
# env: CT_LOG_LEVEL
log_level: warn

# HTTP request timeout
# env: CT_TIMEOUT
timeout: 10s

# Timeout to wait on shutdown to allow load balancers to detect that we're going away.
# During this period after the shutdown command the /alive endpoint will reply with HTTP 503.
# Set to 0s to disable.
# env: CT_TIMEOUT_SHUTDOWN
timeout_shutdown: 10s

# Max number of parallel incoming HTTP requests to handle
# env: CT_CONCURRENCY
concurrency: 10

# Whether to forward metrics metadata from Prometheus to Cortex/Mimir
# Since metadata requests have no timeseries in them - we cannot divide them into tenants
# So the metadata requests will be sent to the default tenant only, if one is not defined - they will be dropped
# env: CT_METADATA
metadata: false

# If true, response codes from the metrics backend will be logged to stdout. This setting can be used to suppress errors
# which can be quite verbose, like 400 (out-of-order samples) or 429 (hitting ingestion limits).
# Also, those are already reported by other services like Cortex/Mimir distributors and ingesters
# env: CT_LOG_RESPONSE_ERRORS
log_response_errors: true

# Maximum duration to keep outgoing connections alive (to Cortex/Mimir)
# Useful for resetting L4 load-balancer state
# Use 0 to keep them indefinitely
# env: CT_MAX_CONN_DURATION
max_connection_duration: 0s

# Address where metrics are available
# env: CT_LISTEN_METRICS_ADDRESS
listen_metrics_address: 0.0.0.0:9090

# If true, then a label with the tenant's name will be added to the metrics
# env: CT_METRICS_INCLUDE_TENANT
metrics_include_tenant: true

tenant:
  # Which label to look for the tenant information
  # env: CT_TENANT_LABEL
  label: tenant

  # List of labels examined for tenant information. If set, takes precedence over `label`.
  # env: CT_TENANT_LABEL_LIST
  label_list:
    - tenant
    - other_tenant

  # Whether to remove the tenant label from the request
  # env: CT_TENANT_LABEL_REMOVE
  label_remove: true

  # To which header to add the tenant ID
  # env: CT_TENANT_HEADER
  header: X-Scope-OrgID

  # Which tenant ID to use if the label is missing in any of the timeseries
  # If this is not set or empty then the write request with missing tenant label
  # will be rejected with HTTP code 400
  # env: CT_TENANT_DEFAULT
  default: foobar

  # Enable if you want all metrics from Prometheus to be accepted with a 204 HTTP code
  # regardless of the response from upstream. This can lose metrics if Cortex/Mimir is
  # throwing rejections.
  # env: CT_TENANT_ACCEPT_ALL
  accept_all: false

  # Optional prefix to be added to a tenant header before sending it to Cortex/Mimir.
  # Make sure to use only allowed characters:
  # https://grafana.com/docs/mimir/latest/configure/about-tenant-ids/
  # env: CT_TENANT_PREFIX
  prefix: foobar-

  # If true, will use the tenant ID of the inbound request as the prefix of the new tenant ID.
  # Will be automatically suffixed with a `-` character.
  # Example:
  #   Prometheus forwards metrics with `X-Scope-OrgID: Prom-A` set in the inbound request.
  #   This would result in the tenant prefix being set to `Prom-A-`.
  # https://grafana.com/docs/mimir/latest/configure/about-tenant-ids/
  # env: CT_TENANT_PREFIX_PREFER_SOURCE
  prefix_prefer_source: false
```

### Prometheus configuration example
```yaml
remote_write:
  - name: cortex_tenant
    url: http://127.0.0.1:8080/push

scrape_configs:
  - job_name: job1
    scrape_interval: 60s
    static_configs:
      - targets:
          - target1:9090
        labels:
          tenant: foobar

  - job_name: job2
    scrape_interval: 60s
    static_configs:
      - targets:
          - target2:9090
        labels:
          tenant: deadbeef
```

This would result in `job1` metrics ending up in the `foobar` tenant in Cortex/Mimir and `job2` in `deadbeef`.
## Building
`make build` should produce an _amd64_ binary.
If you want `deb` or `rpm` packages then install [FPM](https://fpm.readthedocs.io) and then run `make rpm` or `make deb` to create the packages.
## Containerization
To use the prebuilt container, overwrite the default configuration file by mounting your own configuration at `/data/cortex-tenant.yml`.
You can do that by starting the container with:
```bash
docker container run \
  -v <path-to-your-config>:/data/cortex-tenant.yml \
  ghcr.io/blind-oracle/cortex-tenant:1.6.1
```

... or build your own Docker image:
```Dockerfile
FROM ghcr.io/blind-oracle/cortex-tenant:1.6.1
ADD my-config.yml /data/cortex-tenant.yml
```

### Deploy on Kubernetes
#### Using manifests
[`deploy/k8s/manifests`](deploy/k8s/manifests) directory contains the deployment, service and configmap manifest files for deploying this on Kubernetes. You can overwrite the default config by editing the configuration parameters in the configmap manifest.
```bash
kubectl apply -f deploy/k8s/manifests/cortex-tenant-deployment.yaml
kubectl apply -f deploy/k8s/manifests/cortex-tenant-service.yaml
kubectl apply -f deploy/k8s/manifests/config-file-configmap.yml
```

#### Using a Helm Chart
[`deploy/k8s/chart`](deploy/k8s/chart) directory contains a chart for deploying this on Kubernetes.
```bash
helm repo add cortex-tenant https://blind-oracle.github.io/cortex-tenant
helm install cortex-tenant cortex-tenant/cortex-tenant
```

You can use the [`deploy/k8s/chart/testing`](deploy/k8s/chart/testing) directory to test the deployment using helmfile.
```bash
helmfile -f deploy/k8s/chart/testing/helmfile.yaml template
helmfile -f deploy/k8s/chart/testing/helmfile.yaml apply
```

#### Updating chart version
```bash
helm package ./deploy/k8s/chart -d docs
TZ=UTC helm repo index ./docs
```