{"id":50790145,"url":"https://github.com/factorhouse/factor-telemetry","last_synced_at":"2026-06-12T10:30:45.017Z","repository":{"id":349886457,"uuid":"1198959814","full_name":"factorhouse/factor-telemetry","owner":"factorhouse","description":"Ready-to-use observability dashboards and telemetry integrations for Apache Kafka and Apache Flink, powered by high-fidelity metrics from Factor House.","archived":false,"fork":false,"pushed_at":"2026-04-08T00:47:45.000Z","size":1208,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-10T20:23:43.439Z","etag":null,"topics":["apache-flink","apache-kafka","factorhouse","flex","flink","grafana","grafana-dashboards","kafka","kpow","monitoring","observability","openmetrics","prometheus","telemetry"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/factorhouse.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-01T23:40:37.000Z","updated_at":"2026-04-08T00:47:49.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/factorhouse/factor-telemetry","commit_stats":null,"previous_names":["factorhouse/factor-telemetry"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/factorhouse/factor-telemetry","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/factorhouse%2Ffactor-telemet
ry","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/factorhouse%2Ffactor-telemetry/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/factorhouse%2Ffactor-telemetry/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/factorhouse%2Ffactor-telemetry/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/factorhouse","download_url":"https://codeload.github.com/factorhouse/factor-telemetry/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/factorhouse%2Ffactor-telemetry/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34240813,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-12T02:00:06.859Z","response_time":109,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-flink","apache-kafka","factorhouse","flex","flink","grafana","grafana-dashboards","kafka","kpow","monitoring","observability","openmetrics","prometheus","telemetry"],"created_at":"2026-06-12T10:30:42.930Z","updated_at":"2026-06-12T10:30:45.011Z","avatar_url":"https://github.com/factorhouse.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Factor Telemetry\n\nWelcome to **Factor Telemetry**, a comprehensive repository of ready-to-use telemetry integrations and 
dashboard templates for monitoring Apache Kafka and Apache Flink environments. These configurations are designed to visualize the rich, Prometheus-compatible metrics emitted by [Factor House](https://factorhouse.io/) products, including Kpow and Flex.\n\nWhile observability is often associated with Grafana, this project is built to be platform-agnostic. The metrics exposed by Kpow and Flex can be seamlessly integrated into a wide variety of modern monitoring and alerting platforms, including Grafana, Datadog, New Relic, and more. \n\n### 🗂️ Organization\n\nInside this repository, you will find platform-specific configuration files and templates. Currently, our Grafana JSON models are located within the `grafana-templates` directory. To keep things cleanly organized, these dashboards are further divided into dedicated subfolders based on the target product:\n\n* The **`kpow`** folder contains all of our Kafka-focused dashboards (covering environments, topics, consumer groups, and Kafka Connect).\n* The **`flex`** folder contains dashboards dedicated to Flink cluster monitoring. \n\nAs we expand support for other platforms like Datadog, additional directories will be added to help you quickly deploy world-class observability into your tool of choice.\n\n## 📊 Dashboards Overview\n\n**Quality Gap of Raw JMX Data vs. High-Fidelity Metrics**\n\nThe standard approach of routing raw Kafka JMX metrics into Prometheus often leaves teams with noisy dashboards and fragile alerts. Attempting to compute meaningful business metrics, like exact consumer lag or active throughput, from raw JMX offsets using PromQL is notoriously difficult.\n\n**These templates take a different approach.** They are built on top of Kpow, which acts as a high-fidelity metrics engine. 
Instead of relying on JMX sidecars, Kpow directly observes your cluster and exposes pre-calculated, actionable metrics (such as exact `group_offset_lag` and `topic_end_delta`) ready for immediate visualization.\n\n📖 **Read the architectural deep-dive:** [Beyond JMX: Supercharging Grafana Dashboards with High-Fidelity Metrics](https://factorhouse.io/articles/beyond-jmx-supercharging-grafana-dashboards-with-high-fidelity-metrics)\n\n## Dashboard Templates\n\n### 1. Kafka Environment Health\n\nDesigned for Platform Teams, this dashboard provides a high-level macro view of overall cluster stability and capacity.\n\nRather than relying on raw byte counts, it surfaces derived operational health indicators. It tracks total online brokers, overall data on disk, total topics, and total consumer groups. It also visualizes cluster-wide production and consumption rates, and provides a detailed breakdown of topic activity and consumer group health (Stable, Rebalancing, Empty) to give you an instant read on the environment's status.\n\n**🔗 Quick Links:**\n* 📥 **JSON Template:** [`kafka-environment.json`](./grafana-templates/kpow/kafka-environment.json)\n* 🌐 **Grafana Gallery:** [View and Import Dashboard](https://grafana.com/grafana/dashboards/25103-kafka-environment/) *(ID: 25103)*\n\n### 2. Kafka Topic Diagnostics\n\nDesigned for data engineers and platform administrators, this dashboard provides granular visibility into the data layer.\n\nIt tracks aggregate metrics like total topics, total replica disk usage, cluster-wide read/write throughput, and non-preferred leaders. 
Most importantly, it visualizes per-topic production and consumption rates over time, topic size growth, and isolates the exact topics experiencing consumer lag or Under Replicated Partitions (URPs) through detailed diagnostic tables.\n\n**🔗 Quick Links:**\n* 📥 **JSON Template:** [`kafka-topic.json`](./grafana-templates/kpow/kafka-topic.json)\n* 🌐 **Grafana Gallery:** [View and Import Dashboard](https://grafana.com/grafana/dashboards/25104-kafka-topic/) *(ID: 25104)*\n\n### 3. Kafka Consumer Group Deep Dive\n\nDesigned for Application Teams, this dashboard focuses on micro-level Service Level Agreement (SLA) monitoring.\n\nInstead of generic host metrics, it visualizes the exact state of your data consumption. Key metrics include precise total lag (`group_offset_lag`) and real-time consumption rates (`group_offset_delta`). It details total assigned members and hosts, and features a clear status table tracking the exact state of every consumer group to help engineers spot stalling applications before downstream users are impacted.\n\n**🔗 Quick Links:**\n* 📥 **JSON Template:** [`kafka-consumer-group.json`](./grafana-templates/kpow/kafka-consumer-group.json)\n* 🌐 **Grafana Gallery:** [View and Import Dashboard](https://grafana.com/grafana/dashboards/25105-kafka-consumer-group/) *(ID: 25105)*\n\n### 4. Kafka Connect Operations\n\nData pipeline reliability depends heavily on integration health. This dashboard targets Kafka Connect deployments, replacing tedious API queries with instant visual feedback.\n\nIt tracks aggregate summary statistics alongside individual Connector and Task states. 
By mapping state labels directly to distinct visual alerts (RUNNING, PAUSED, FAILED, UNASSIGNED, UNREACHABLE), teams can immediately detect stalled integrations and isolate whether the failure exists at the connector or task level.\n\n**🔗 Quick Links:**\n* 📥 **JSON Template:** [`kafka-connect.json`](./grafana-templates/kpow/kafka-connect.json)\n* 🌐 **Grafana Gallery:** [View and Import Dashboard](https://grafana.com/grafana/dashboards/25106-kafka-connect/) *(ID: 25106)*\n\n## 🚀 Getting Started with Grafana Cloud\n\nThese instructions illustrate how to wire up Grafana Cloud's agentless **Metrics Endpoint** integration to scrape Kpow directly, without needing to manage a local Prometheus instance. \n\n### 📋 Prerequisites: Enable Kpow Telemetry\nBefore configuring Grafana Cloud, ensure that your Kpow instance is configured to expose its Prometheus metrics and that the endpoints are secured with Basic Authentication (Grafana Cloud's agentless scraper requires authenticated endpoints). \n\n🔗 **Read the official guide:** [Enabling Kpow's Prometheus Integration](https://docs.factorhouse.io/kpow/integration/prometheus/overview)\n\n### Step 1: Configure Metrics Endpoints (Scrape Jobs)\n\nGrafana Cloud can scrape Kpow directly over the internet. You will need to create a scrape job for each of Kpow's metric endpoints. \n\n1. Log in to your Grafana Cloud portal.\n2. Navigate to **Connections** \u003e **Add new connection**.\n3. Search for and select **Metrics Endpoint**.\n4. Click **Add new scrape job** and create three separate jobs using the following URLs:\n   * `https://\u003cyour-kpow-domain\u003e/metrics/v1` (replace with your Kpow domain)\n   * `https://\u003cyour-kpow-domain\u003e/offsets/v1`\n   * `https://\u003cyour-kpow-domain\u003e/group-offsets/v1`\n   \u003e ❗ **Authentication:** The endpoints must be secured, because the Metrics Endpoint integration requires authentication. You can select either **Basic** or **Bearer (OAuth)** authentication.\n5. 
Click **Test Connection** and **Save Scrape Job** for each job. Grafana will immediately start polling these endpoints and storing the data in your built-in Prometheus database.\n\n### Step 2: Check Metrics are Flowing\n\nBefore importing the dashboards, verify that Grafana Cloud is successfully receiving the data:\n\n1. In Grafana, go to the left-hand menu and click **Explore** (the compass icon).\n2. Ensure your default **Prometheus** data source is selected in the top-left dropdown (usually named `grafanacloud-\u003cyour-stack\u003e-prom`).\n3. In the query bar, type a metric like `topic_count` or `broker_count` and run the query.\n4. If you see a graph or data table populate, your connection is working perfectly!\n\n### Step 3: Create Dashboards from Templates\n\nWith the data flowing, you can now import the JSON templates provided in this repository.\n\n1. Download the `.json` files from the `grafana-templates/kpow` directory in this repo.\n2. In Grafana, navigate to **Dashboards** \u003e **New** \u003e **Import**.\n3. Upload the `.json` file (or paste the raw JSON text into the provided box) and click **Load**.\n4. At the bottom of the import options screen, you will be prompted to select a **Prometheus** data source. Select your Grafana Cloud Prometheus data source from the dropdown.\n5. Click **Import**.\n\nYour dashboard will instantly load and populate with live metrics! Repeat this process for the remaining dashboards.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffactorhouse%2Ffactor-telemetry","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffactorhouse%2Ffactor-telemetry","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffactorhouse%2Ffactor-telemetry/lists"}