{"id":15062643,"url":"https://github.com/stolostron/multicluster-observability-operator","last_synced_at":"2025-05-16T04:03:41.813Z","repository":{"id":37256089,"uuid":"250143965","full_name":"stolostron/multicluster-observability-operator","owner":"stolostron","description":"Operator for Multi-Cluster Monitoring with Thanos.","archived":false,"fork":false,"pushed_at":"2025-05-10T10:05:40.000Z","size":46304,"stargazers_count":132,"open_issues_count":50,"forks_count":77,"subscribers_count":15,"default_branch":"main","last_synced_at":"2025-05-10T11:19:12.987Z","etag":null,"topics":["grafana","kubernetes","monitoring","observability","open-cluster-management","openshift-operator","prometheus","thanos"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stolostron.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-03-26T02:41:14.000Z","updated_at":"2025-04-30T10:52:39.000Z","dependencies_parsed_at":"2023-11-20T22:27:06.659Z","dependency_job_id":"76e3ad22-b74f-4d66-b19f-4bf99b10c54e","html_url":"https://github.com/stolostron/multicluster-observability-operator","commit_stats":null,"previous_names":["open-cluster-management/multicluster-monitoring-operator","open-cluster-management/multicluster-observability-operator"],"tags_count":43,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stolostron%2Fmulticluster-observability-operator","tags_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/stolostron%2Fmulticluster-observability-operator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stolostron%2Fmulticluster-observability-operator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stolostron%2Fmulticluster-observability-operator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stolostron","download_url":"https://codeload.github.com/stolostron/multicluster-observability-operator/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254464891,"owners_count":22075570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["grafana","kubernetes","monitoring","observability","open-cluster-management","openshift-operator","prometheus","thanos"],"created_at":"2024-09-24T23:44:02.073Z","updated_at":"2025-05-16T04:03:41.786Z","avatar_url":"https://github.com/stolostron.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Observability Overview\n\n[![Build](https://img.shields.io/badge/build-Prow-informational)](https://prow.ci.openshift.org/?repo=stolostron%2F${multicluster-observability-operator})\n[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=stolostron_multicluster-observability-operator\u0026metric=alert_status\u0026token=3452dcca82a98e4aa297c1b31fd21939288db4c0)](https://sonarcloud.io/dashboard?id=stolostron_multicluster-observability-operator)\n\nThis document attempts to explain how the different components in Open 
Cluster Management Observability come together to deliver multicluster fleet observability. We leverage several open source projects: [Grafana](https://github.com/grafana/grafana), [Alertmanager](https://github.com/prometheus/alertmanager), [Thanos](https://github.com/thanos-io/thanos/), [Observatorium Operator and API Gateway](https://github.com/observatorium), and [Prometheus](https://github.com/prometheus/prometheus). We also leverage a few [Open Cluster Management projects](https://open-cluster-management.io/), namely [Cluster Manager or Registration Operator](https://github.com/stolostron/registration-operator) and [Klusterlet](https://github.com/stolostron/registration-operator). The multicluster-observability operator is the root operator, which pulls in everything needed.\n\n## Conceptual Diagram\n\n![Conceptual Diagram of the Components](docs/images/observability_overview_in_ocm.png)\n\n## Associated GitHub Repositories\n\nComponent |Git Repo | Description\n---  | ------ | ----  \nMCO Operator | [multicluster-observability-operator](https://github.com/stolostron/multicluster-observability-operator) | Operator for monitoring. This is the root repo. If you follow the README instructions here to install, the code from all other repos mentioned below is used/referenced.\nEndpoint Operator | [endpoint-metrics-operator](https://github.com/stolostron/multicluster-observability-operator/tree/main/operators/endpointmetrics) | Operator that manages setting up observability and data collection at the managed clusters.\nObservatorium Operator | [observatorium-operator](https://github.com/stolostron/observatorium-operator) | Operator to deploy the Observatorium project. Within Open Cluster Management, this currently means metrics using Thanos. 
Forked from the main observatorium-operator repo.\nMetrics collector | [metrics-collector](https://github.com/stolostron/multicluster-observability-operator/tree/main/collectors/metrics) | Scrapes metrics from Prometheus at managed clusters; which metrics are collected is controlled by configuring an allow-list.\nRBAC Proxy | [rbac_query_proxy](https://github.com/stolostron/multicluster-observability-operator/tree/main/proxy) | Helper service that acts as a multicluster metrics RBAC proxy.\nGrafana | [grafana](https://github.com/stolostron/grafana) | Grafana repo for dashboarding and metric analytics. Forked from the main grafana repo.\nDashboard Loader | [grafana-dashboard-loader](https://github.com/stolostron/multicluster-observability-operator/tree/main/loaders/dashboards) | Sidecar proxy to load Grafana dashboards from configmaps.\nManagement Ingress | [management-ingress](https://github.com/stolostron/management-ingress) | NGINX-based ingress controller to serve Open Cluster Management services.\nObservatorium API | [observatorium](https://github.com/stolostron/observatorium) | API Gateway which controls reading and writing of the Observability data to the backend infrastructure. Forked from the main observatorium API repo.\nThanos Ecosystem | [kube-thanos](https://github.com/stolostron/kube-thanos) | Kubernetes-specific configuration for deploying Thanos. 
The observatorium operator leverages this configuration to deploy the backend Thanos components.\n\n## Quick Start Guide\n\n### Prerequisites\n\n* Ensure [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl) and [kustomize](https://kubectl.docs.kubernetes.io/installation/kustomize/) are installed.\n* Prepare an OpenShift cluster to function as the hub cluster.\n* Ensure [docker 17.03+](https://docs.docker.com/get-started) is installed.\n* Ensure [golang 1.15+](https://golang.org/doc/install) is installed.\n* Ensure [operator-sdk 1.4.2+](https://github.com/operator-framework/operator-sdk) is installed.\n* Ensure the open-cluster-management cluster manager is installed. See [Cluster Manager](https://open-cluster-management.io/getting-started/core/cluster-manager/) for more information.\n* Ensure the `open-cluster-management` _klusterlet_ is installed. See [Klusterlet](https://open-cluster-management.io/getting-started/core/register-cluster/) for more information.\n\n\u003e Note: By default, the API conversion webhook relies on the OpenShift service serving certificate feature to manage the certificate. You can replace it with cert-manager if you want to run the multicluster-observability-operator in a Kubernetes cluster.\n\nUse the following quick start commands for building and testing the multicluster-observability-operator:\n\n### Clone the Repository\n\nCheck out the multicluster-observability-operator repository.\n\n```bash\ngit clone git@github.com:stolostron/multicluster-observability-operator.git\ncd multicluster-observability-operator\n```\n\n### Build the Operator\n\nBuild the multicluster-observability-operator image and push it to a public registry, such as quay.io:\n\n```bash\nmake docker-build docker-push IMG=quay.io/\u003cYOUR_USERNAME_IN_QUAY\u003e/multicluster-observability-operator:latest\n```\n\n### Run the Operator in the Cluster\n\n1. 
Create the `open-cluster-management-observability` namespace if it doesn't exist:\n\n```bash\nkubectl create ns open-cluster-management-observability\n```\n\n2. Deploy the minio service which acts as storage service of the multicluster observability:\n\n```bash\nkubectl -n open-cluster-management-observability apply -k examples/minio\n```\n\n3. Replace the operator image and deploy the multicluster-observability-operator:\n\n```bash\nmake deploy IMG=quay.io/\u003cYOUR_USERNAME_IN_QUAY\u003e/multicluster-observability-operator:latest\n```\n\n4. Deploy the multicluster-observability-operator CR:\n\n```bash\nkubectl apply -f operators/multiclusterobservability/config/samples/observability_v1beta2_multiclusterobservability.yaml\n```\n\n5. Verify all the components for the Multicluster Observability are starting up and running:\n\n```bash\nkubectl -n open-cluster-management-observability get pod\nNAME                                                       READY   STATUS    RESTARTS   AGE\nminio-79c7ff488d-72h65                                     1/1     Running   0          9m38s\nobservability-alertmanager-0                               3/3     Running   0          7m17s\nobservability-alertmanager-1                               3/3     Running   0          6m36s\nobservability-alertmanager-2                               3/3     Running   0          6m18s\nobservability-grafana-85fdc8c48d-j67j6                     2/2     Running   0          7m17s\nobservability-grafana-85fdc8c48d-wnltt                     2/2     Running   0          7m17s\nobservability-observatorium-api-69cfff4c95-bpw5s           1/1     Running   0          7m2s\nobservability-observatorium-api-69cfff4c95-gbh7b           1/1     Running   0          7m2s\nobservability-observatorium-operator-5df6b7949c-kbpmp      1/1     Running   0          7m17s\nobservability-rbac-query-proxy-d44df47c4-9ccdn             2/2     Running   0          7m15s\nobservability-rbac-query-proxy-d44df47c4-rtcgh        
     2/2     Running   0          6m50s\nobservability-thanos-compact-0                             1/1     Running   0          7m2s\nobservability-thanos-query-79c4d9488b-bd5sf                1/1     Running   0          7m3s\nobservability-thanos-query-79c4d9488b-d7wzt                1/1     Running   0          7m3s\nobservability-thanos-query-frontend-6fdb5d4946-rgblb       1/1     Running   0          7m3s\nobservability-thanos-query-frontend-6fdb5d4946-shsz2       1/1     Running   0          7m3s\nobservability-thanos-query-frontend-memcached-0            2/2     Running   0          7m3s\nobservability-thanos-query-frontend-memcached-1            2/2     Running   0          6m37s\nobservability-thanos-query-frontend-memcached-2            2/2     Running   0          6m33s\nobservability-thanos-receive-controller-6b446c5576-hj6xl   1/1     Running   0          7m3s\nobservability-thanos-receive-default-0                     1/1     Running   0          7m2s\nobservability-thanos-receive-default-1                     1/1     Running   0          6m20s\nobservability-thanos-receive-default-2                     1/1     Running   0          5m50s\nobservability-thanos-rule-0                                2/2     Running   0          7m3s\nobservability-thanos-rule-1                                2/2     Running   0          6m27s\nobservability-thanos-rule-2                                2/2     Running   0          5m56s\nobservability-thanos-store-memcached-0                     2/2     Running   0          7m3s\nobservability-thanos-store-memcached-1                     2/2     Running   0          6m37s\nobservability-thanos-store-memcached-2                     2/2     Running   0          6m33s\nobservability-thanos-store-shard-0-0                       1/1     Running   2          7m3s\nobservability-thanos-store-shard-1-0                       1/1     Running   2          7m3s\nobservability-thanos-store-shard-2-0                       1/1     
Running   2          7m3s\n```\n\n### What is next\n\nAfter a successful deployment, run the following command to check whether an OpenShift (OCP) cluster is registered as a managed cluster.\n\n```bash\nkubectl get managedcluster --show-labels\n```\n\nIf your managed cluster has no `vendor=OpenShift` label, you can add it manually by running `kubectl label managedcluster \u003cmanaged cluster name\u003e vendor=OpenShift`.\n\nYou should then see the `metrics-collector` pod running:\n\n```bash\nkubectl -n open-cluster-management-addon-observability get pod\nendpoint-observability-operator-5c95cb9df9-4cphg   1/1     Running   0          97m\nmetrics-collector-deployment-6c7c8f9447-brpjj      1/1     Running   0          96m\n```\n\nExpose the Thanos query frontend through a route by running this command:\n\n```bash\ncat \u003c\u003c EOF | kubectl -n open-cluster-management-observability apply -f -\nkind: Route\napiVersion: route.openshift.io/v1\nmetadata:\n  name: query-frontend\nspec:\n  port:\n    targetPort: http\n  wildcardPolicy: None\n  to:\n    kind: Service\n    name: observability-thanos-query-frontend\nEOF\n```\n\nYou can access the Thanos query UI in a browser using the host returned by `oc get route -n open-cluster-management-observability query-frontend`. Metrics should be available when you search for `:node_memory_MemAvailable_bytes:sum`. The available metrics are listed [here](https://github.com/stolostron/multicluster-observability-operator/blob/main/operators/multiclusterobservability/manifests/base/config/metrics_allowlist.yaml).\n\n### Uninstall the Operator in the Cluster\n\n1. Delete the multicluster-observability-operator CR:\n\n```bash\nkubectl -n open-cluster-management-observability delete -f operators/multiclusterobservability/config/samples/observability_v1beta2_multiclusterobservability.yaml\n```\n\n2. Delete the multicluster-observability-operator:\n\n```bash\nmake undeploy\n```\n\n3. 
Delete the minio service:\n\n```bash\nkubectl -n open-cluster-management-observability delete -k examples/minio\n```\n\n4. Delete the `open-cluster-management-observability` namespace:\n\n```bash\nkubectl delete ns open-cluster-management-observability\n```\n\nRebuild Image: Wed Jan 25 15:08:26 EST 2023","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstolostron%2Fmulticluster-observability-operator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstolostron%2Fmulticluster-observability-operator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstolostron%2Fmulticluster-observability-operator/lists"}