https://github.com/axonops/axonops-operator
Kubernetes operator for managing the AxonOps observability stack. Deploys and configures axon-server, axon-dash, and backing databases with support for alerts, backups, repairs, Kafka management, and more.
- Host: GitHub
- URL: https://github.com/axonops/axonops-operator
- Owner: axonops
- License: apache-2.0
- Created: 2026-03-13T14:17:29.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-04-09T17:49:15.000Z (2 months ago)
- Last Synced: 2026-04-09T18:24:56.298Z (2 months ago)
- Topics: alerting, axonops, backups, cassandra, cassandra-database, controller-runtime, helm-chart, kafka, kafka-cluster, kubebuilder, kubernetes, kubernetes-operator, monitoring, observability
- Language: Go
- Homepage:
- Size: 27.2 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 14
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Agents: AGENTS.md
# AxonOps Kubernetes Operator
A Kubernetes operator that deploys and manages the [AxonOps](https://axonops.com) control plane. It replaces both the AxonOps Helm charts and Terraform provider, giving you a single, declarative interface for running AxonOps entirely within Kubernetes.
## What It Does
- **Deploys the full AxonOps stack** — axon-server, axon-dash, axondb-timeseries, and axondb-search — from a single `AxonOpsPlatform` custom resource.
- **Manages AxonOps configuration** — alert rules, alert routes, healthchecks, dashboard templates, backups, scheduled repairs, and commitlog archives are reconciled as Kubernetes resources and kept in sync with the AxonOps API.
- **Manages Kafka resources** — topics, ACLs, and connectors for Kafka-based clusters.
- **Handles day-2 operations** — credential rotation, TLS certificate management, startup ordering, and Ingress/Gateway API configuration.
---
## CRDs
### `core.axonops.com/v1alpha1`
| Kind | Purpose |
|---|---|
| `AxonOpsPlatform` | Deploys and manages the full AxonOps server stack |
| `AxonOpsConnection` | Stores reusable API credentials for the AxonOps API |
### `alerts.axonops.com/v1alpha1`
| Kind | Purpose |
|---|---|
| `AxonOpsMetricAlert` | Metric threshold alerts |
| `AxonOpsLogAlert` | Log pattern alerts |
| `AxonOpsAlertRoute` | Alert routing and notification channels |
| `AxonOpsAlertEndpoint` | Alert notification endpoints (email, Slack, PagerDuty, etc.) |
| `AxonOpsHealthcheckHTTP` | HTTP endpoint healthchecks |
| `AxonOpsHealthcheckTCP` | TCP port healthchecks |
| `AxonOpsHealthcheckShell` | Shell script healthchecks |
| `AxonOpsDashboardTemplate` | Declarative dashboard management |
| `AxonOpsAdaptiveRepair` | Adaptive repair scheduling |
| `AxonOpsScheduledRepair` | Scheduled repair management |
| `AxonOpsCommitlogArchive` | Commitlog archive management |
| `AxonOpsSilenceWindow` | Alert silence windows |
| `AxonOpsLogCollector` | Log collector configuration |
### `backups.axonops.com/v1alpha1`
| Kind | Purpose |
|---|---|
| `AxonOpsBackup` | Backup scheduling and management |
### `kafka.axonops.com/v1alpha1`
| Kind | Purpose |
|---|---|
| `AxonOpsKafkaTopic` | Kafka topic management |
| `AxonOpsKafkaACL` | Kafka ACL management |
| `AxonOpsKafkaConnector` | Kafka connector management |
---
## Prerequisites
- Kubernetes 1.28+
- [cert-manager](https://cert-manager.io) — required only when using internal database components (TimeSeries or Search)
- [Gateway API CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) — required only when using Gateway API ingress
---
## Installation
**Install from OCI registry (recommended):**
The Helm chart is published as an OCI artifact on GitHub Container Registry. Install with:
```bash
helm upgrade --install axonops-operator \
  oci://ghcr.io/axonops/charts/axonops-operator \
  --version 0.1.0 \
  --namespace axonops-operator-system --create-namespace
```
To see available versions, check the [releases page](https://github.com/axonops/axonops-operator/releases) or the [package registry](https://github.com/axonops/axonops-operator/pkgs/container/charts%2Faxonops-operator).
**Install from local chart source:**
```bash
helm upgrade --install axonops-operator \
  ./charts/axonops-operator/ \
  --namespace axonops-operator-system --create-namespace
```
**Install from source with Kustomize:**
```bash
make deploy IMG=<registry>/<image>:<tag>
```
---
## Quick Start
### All-in-one deployment (fully managed)
Deploy the complete AxonOps stack with a single resource. The operator provisions all components, generates credentials, and manages TLS certificates automatically.
> **Note:** All components must have `enabled: true` set explicitly until the defaulting webhook is implemented.
```yaml
apiVersion: core.axonops.com/v1alpha1
kind: AxonOpsPlatform
metadata:
  name: axonops
  namespace: axonops
spec:
  server:
    orgName: "my-company"
  timeSeries:
    enabled: true
  search:
    enabled: true
  dashboard:
    enabled: true
```
### External databases
Connect the AxonOps server to existing Cassandra and Elasticsearch/OpenSearch clusters instead of running them in-cluster.
```yaml
apiVersion: core.axonops.com/v1alpha1
kind: AxonOpsPlatform
metadata:
  name: axonops
  namespace: axonops
spec:
  server:
    orgName: "my-company"
  timeSeries:
    external:
      hosts:
        - cassandra-node1.example.com:9042
        - cassandra-node2.example.com:9042
      tls:
        enabled: false
    authentication:
      secretRef: cassandra-credentials  # Secret with AXONOPS_DB_USER / AXONOPS_DB_PASSWORD
  search:
    external:
      hosts:
        - https://elasticsearch.example.com:9200
      tls:
        enabled: true
        insecureSkipVerify: true  # Set to false and add certSecretRef for full TLS verification
        # certSecretRef: elasticsearch-tls  # Secret with ca.crt, tls.crt, tls.key
    authentication:
      secretRef: elasticsearch-credentials  # Secret with AXONOPS_SEARCH_USER / AXONOPS_SEARCH_PASSWORD
  dashboard: {}
```
> **TLS verification:** When `tls.enabled=true` and `tls.insecureSkipVerify=false`, you must also set `tls.certSecretRef` to the name of a Secret containing `ca.crt`, `tls.crt`, and `tls.key`. Without it, the operator sets `Ready=False` with reason `MissingExternalTLSCert`.
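The rule in the note above can be read as a small validation check. The following Go sketch is illustrative only: the `ExternalTLS` and `Condition` types and the `checkExternalTLS` function are hypothetical stand-ins, not the operator's actual API types. Only the field names and the `MissingExternalTLSCert` reason come from this README.

```go
package main

import "fmt"

// ExternalTLS mirrors the three tls fields named in the note above.
// Hypothetical type: the operator's real Go structs may differ.
type ExternalTLS struct {
	Enabled            bool
	InsecureSkipVerify bool
	CertSecretRef      string
}

// Condition is a simplified stand-in for a Kubernetes status condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// checkExternalTLS returns a Ready=False condition when certificate
// verification is requested but no certificate Secret is referenced.
// It returns nil when the TLS settings are acceptable.
func checkExternalTLS(tls ExternalTLS) *Condition {
	if tls.Enabled && !tls.InsecureSkipVerify && tls.CertSecretRef == "" {
		return &Condition{Type: "Ready", Status: "False", Reason: "MissingExternalTLSCert"}
	}
	return nil
}

func main() {
	cases := []ExternalTLS{
		{Enabled: false},
		{Enabled: true, InsecureSkipVerify: true},
		{Enabled: true, CertSecretRef: "elasticsearch-tls"},
		{Enabled: true}, // verification requested, no Secret referenced
	}
	for _, c := range cases {
		if cond := checkExternalTLS(c); cond != nil {
			fmt.Printf("%+v -> %s=%s (%s)\n", c, cond.Type, cond.Status, cond.Reason)
			continue
		}
		fmt.Printf("%+v -> ok\n", c)
	}
}
```

Only the last case in `main` produces the failing condition; disabling TLS, skipping verification, or supplying `certSecretRef` each satisfy the check.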
### Alert management
Create an `AxonOpsConnection` once per namespace, then reference it from alert resources.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: axonops-api-key
  namespace: axonops
stringData:
  api-key: "your-api-key-here"
---
apiVersion: core.axonops.com/v1alpha1
kind: AxonOpsConnection
metadata:
  name: axonops-api
  namespace: axonops
spec:
  orgId: "my-org-id"
  host: "axonops.example.com"
  protocol: "https"
  apiKeyRef:
    name: axonops-api-key
    key: api-key
---
apiVersion: alerts.axonops.com/v1alpha1
kind: AxonOpsMetricAlert
metadata:
  name: high-read-latency
  namespace: axonops
spec:
  connectionRef: axonops-api
  clusterName: production-cluster
  clusterType: cassandra
  name: high-read-latency
  operator: ">"
  warningValue: 50
  criticalValue: 100
  duration: 15m
  dashboard: Cassandra Overview
  chart: Read Latency
  annotations:
    summary: "Cassandra read latency is high"
```
---
## AxonOpsPlatform Components
Each component can operate in **internal** (operator-managed) or **external** (user-provided) mode.
| Component | Image | Internal | External |
|---|---|---|---|
| `axon-server` | `axon-server` | StatefulSet | n/a |
| `axon-dash` | `axon-dash` | Deployment | n/a |
| `axondb-timeseries` | `axondb-timeseries` | StatefulSet | Cassandra-compatible |
| `axondb-search` | `axondb-search` | StatefulSet | Elasticsearch/OpenSearch |
### Authentication
For internal (operator-managed) database components, credentials are resolved in this priority order:
1. `authentication.secretRef` — reference an existing Secret containing `AXONOPS_DB_USER` / `AXONOPS_DB_PASSWORD` (or `AXONOPS_SEARCH_USER` / `AXONOPS_SEARCH_PASSWORD` for Search)
2. `authentication.username` / `authentication.password` — inline credentials in the CR
3. Auto-generated — operator creates and manages a Secret with random credentials
> **Note:** External components (those with `spec.*.external.hosts` set) require explicit credentials. Auto-generation does not apply to external database connections. Set `authentication.secretRef` or `authentication.username`/`authentication.password` when using external hosts.
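The priority order and the external-host exception above can be sketched as a single decision function. This is a minimal illustration, not the operator's code: the `Authentication` type, the `Source` values, and `resolveSource` are hypothetical names. The sketch also assumes that inline credentials count only when both username and password are set, which this README does not state.

```go
package main

import (
	"errors"
	"fmt"
)

// Authentication mirrors the authentication fields named above.
// Hypothetical type: the operator's real Go structs may differ.
type Authentication struct {
	SecretRef string
	Username  string
	Password  string
}

// Source identifies where a component's credentials come from.
type Source string

const (
	FromSecretRef Source = "secretRef"
	FromInline    Source = "inline"
	FromGenerated Source = "generated"
)

var errExplicitRequired = errors.New("external component requires explicit credentials")

// resolveSource applies the documented priority: secretRef first, then
// inline username/password, then auto-generation. Auto-generation is
// refused for external components, which must supply credentials.
func resolveSource(auth Authentication, external bool) (Source, error) {
	switch {
	case auth.SecretRef != "":
		return FromSecretRef, nil
	case auth.Username != "" && auth.Password != "": // assumption: both required
		return FromInline, nil
	case external:
		return "", errExplicitRequired
	default:
		return FromGenerated, nil
	}
}

func main() {
	src, err := resolveSource(Authentication{SecretRef: "cassandra-credentials", Username: "u", Password: "p"}, false)
	fmt.Println(src, err) // secretRef takes priority over inline values

	src, err = resolveSource(Authentication{}, false)
	fmt.Println(src, err) // internal component falls back to generated credentials

	src, err = resolveSource(Authentication{}, true)
	fmt.Println(src, err) // external component without credentials is an error
}
```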
### Ingress and Gateway API
Both Dashboard and Server endpoints (agent and API) support `ingress` and `gateway` configuration independently. You can enable one, both, or neither per endpoint.
### TLS
When using internal TimeSeries or Search components, the operator creates TLS certificates via cert-manager. cert-manager is not required for external database configurations.
### Startup ordering
The operator enforces dependency ordering: Server waits for its databases to be ready; Dashboard waits for Server.
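One way to picture this ordering is as a gate that each component must pass before it is started. The Go sketch below is a simplified model under stated assumptions: the `Readiness` type and `canStart` function are hypothetical, and the operator's real reconciliation logic is not shown in this README.

```go
package main

import "fmt"

// Readiness is a hypothetical snapshot of which components report ready.
type Readiness struct {
	TimeSeries bool
	Search     bool
	Server     bool
}

// canStart reports whether a component's dependencies are satisfied:
// the databases have none, axon-server needs both databases, and
// axon-dash needs axon-server. Unknown names are never startable.
func canStart(component string, r Readiness) bool {
	switch component {
	case "axondb-timeseries", "axondb-search":
		return true
	case "axon-server":
		return r.TimeSeries && r.Search
	case "axon-dash":
		return r.Server
	default:
		return false
	}
}

func main() {
	order := []string{"axondb-timeseries", "axondb-search", "axon-server", "axon-dash"}
	state := Readiness{TimeSeries: true, Search: false}
	for _, c := range order {
		fmt.Printf("%-18s startable=%v\n", c, canStart(c, state))
	}
}
```

With only TimeSeries ready, the sketch keeps both `axon-server` and `axon-dash` waiting, which matches the dependency chain described above.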
---
## Samples
Pre-built sample resources are available under `config/samples/`:
```bash
kubectl apply -k config/samples/
```
For more detailed examples including alert configuration, K8ssandra integration, and full-stack deployments, see the [`examples/`](examples/README.md) directory.
---
## Development
```bash
make manifests # Regenerate CRDs and RBAC from kubebuilder markers
make generate # Regenerate DeepCopy methods
make fmt && make vet # Format and vet code
make lint # Run linter
make test # Run unit tests (uses envtest)
make build # Build the manager binary
```
See [CONTRIBUTING.md](CONTRIBUTING.md) for the full development workflow.
---
## Uninstall
**If installed via Helm:**
```bash
kubectl delete -k config/samples/ # Remove sample CRs
helm uninstall axonops-operator -n axonops-operator-system # Remove the operator
```
**If installed via Kustomize / Make:**
```bash
kubectl delete -k config/samples/ # Remove sample CRs
make uninstall # Remove CRDs from the cluster
make undeploy # Remove the operator from the cluster
```
---
## License
© 2026 AxonOps Limited. All rights reserved.
Licensed under the [Apache License, Version 2.0](LICENSE).