{"id":40219346,"url":"https://github.com/osodevops/kafka-backup-operator","last_synced_at":"2026-04-21T13:05:28.059Z","repository":{"id":327982106,"uuid":"1107228567","full_name":"osodevops/kafka-backup-operator","owner":"osodevops","description":"Kubernetes operator for automated Kafka backup and disaster recovery. Supports scheduled backups, point-in-time recovery, multi-cloud storage (S3, Azure, GCS), and Azure Workload Identity.","archived":false,"fork":false,"pushed_at":"2026-04-16T09:32:18.000Z","size":465,"stargazers_count":3,"open_issues_count":5,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-16T11:31:24.661Z","etag":null,"topics":["azure","backup","disaster-recovery","helm","kafka","kube-rs","kubernetes","operator","rust","s3"],"latest_commit_sha":null,"homepage":"https://kafkabackup.com/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/osodevops.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-30T20:21:52.000Z","updated_at":"2026-04-16T09:32:23.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/osodevops/kafka-backup-operator","commit_stats":null,"previous_names":["osodevops/kafka-backup-operator"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/osodevops/kafka-backup-operator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osodevops%2Fkafka-backup-operator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osodevops%2Fkafka-backup-operator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osodevops%2Fkafka-backup-operator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osodevops%2Fkafka-backup-operator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/osodevops","download_url":"https://codeload.github.com/osodevops/kafka-backup-operator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osodevops%2Fkafka-backup-operator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32093164,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-21T11:25:29.218Z","status":"ssl_error","status_checked_at":"2026-04-21T11:25:28.499Z","response_time":128,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["azure","backup","disaster-recovery","helm","kafka","kube-rs","kubernetes","operator","rust","s3"],"created_at":"2026-01-19T22:02:39.443Z","updated_at":"2026-04-21T13:05:28.020Z","avatar_url":"https://github.com/osodevops.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Kafka Backup Operator\n\n[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)\n[![Rust](https://img.shields.io/badge/rust-1.75%2B-orange.svg)](https://www.rust-lang.org/)\n[![Kubernetes](https://img.shields.io/badge/kubernetes-1.26%2B-326CE5.svg)](https://kubernetes.io/)\n\nA Kubernetes operator for automated Kafka backup and disaster recovery. Built with Rust using [kube-rs](https://kube.rs/) for high performance and reliability.\n\n## Features\n\n- **Scheduled Backups** - Cron-based automatic backups with configurable retention\n- **Point-in-Time Recovery (PITR)** - Restore data to any specific timestamp\n- **Multi-Cloud Storage** - Support for PVC, S3, Azure Blob Storage, and GCS\n- **Azure Workload Identity** - Secure, secretless authentication for Azure\n- **Compression** - LZ4 and Zstd compression support with configurable levels\n- **Checkpointing** - Resumable backups that survive pod restarts\n- **Rate Limiting** - Control backup/restore throughput to minimize cluster impact\n- **Circuit Breaker** - Automatic failure detection and recovery\n- **Topic Mapping** - Restore to different topic names or partitions\n- **Consumer Offset Management** - Reset and rollback consumer group offsets\n- **Validation Evidence** - Validate backups and generate compliance evidence reports\n- **Prometheus Metrics** - Full observability with built-in metrics endpoint\n\n## Quick Start\n\n### Prerequisites\n\n- Kubernetes cluster (1.26+)\n- Helm 3.x\n- A running Kafka cluster\n\n### Installation\n\n```bash\n# Add the OSO DevOps Helm repository\nhelm repo add oso https://osodevops.github.io/helm-charts/\nhelm repo update\n\n# Install the operator\nhelm install kafka-backup-operator oso/kafka-backup-operator \\\n  --namespace kafka-backup-system \\\n  --create-namespace\n```\n\n### Create Your First Backup\n\n```yaml\napiVersion: kafka.oso.sh/v1alpha1\nkind: KafkaBackup\nmetadata:\n  name: my-backup\n  namespace: kafka\nspec:\n  kafkaCluster:\n    bootstrapServers:\n      - kafka-bootstrap:9092\n  topics:\n    - orders\n    - events\n  storage:\n    storageType: pvc\n    pvc:\n      claimName: kafka-backups\n  schedule: \"0 0 */6 * * * *\"  # Every 6 hours\n  stopAtCurrentOffsets: true\n  compression: zstd\n```\n\n```bash\nkubectl apply -f backup.yaml\n```\n\n## Custom Resource Definitions\n\nThe operator provides five CRDs for managing Kafka backup and restore operations:\n\n| CRD | Short Name | Description |\n|-----|------------|-------------|\n| `KafkaBackup` | `kb` | Define backup schedules and configurations |\n| `KafkaRestore` | `kr` | Trigger restore operations from backups |\n| `KafkaOffsetReset` | `kor` | Reset consumer group offsets |\n| `KafkaOffsetRollback` | `korb` | Rollback offsets after failed restores |\n| `KafkaBackupValidation` | `kbv` | Validate backups and produce evidence reports |\n\n## Upgrade Notes\n\n### 1.0.0\n\nThis release aligns the operator CRDs and adapters with `kafka-backup-core` `v0.12.0`.\n\n- `checkpoint.enabled` no longer makes a `KafkaBackup` run continuously. Set `continuous: true` explicitly for streaming backups.\n- Scheduled point-in-time backups should set `stopAtCurrentOffsets: true` so the backup exits after it reaches the high watermarks captured at start.\n- `KafkaBackup` can now snapshot consumer group offsets with `consumerGroupSnapshot: true`; `KafkaRestore` can load those snapshots with `autoConsumerGroups: true`.\n- `KafkaRestore` now exposes `repartitioning`, `produceAcks`, `produceTimeoutMs`, and `purgeTopics`.\n- S3-compatible endpoints can use `storage.s3.pathStyle` and `storage.s3.allowHttp`; `allowHttp` logs a warning because it may enable plaintext object storage traffic.\n- `kafkaCluster.connection.connectionsPerBroker` can be tuned on backup, restore, validation, offset reset, and offset rollback resources.\n- Azure storage authentication validation now accepts the adapter-supported methods: workload identity, service principal, SAS token, account key, or default credential fallback.\n\n## Configuration Examples\n\n### Backup to Azure Blob Storage (with Workload Identity)\n\n```yaml\napiVersion: kafka.oso.sh/v1alpha1\nkind: KafkaBackup\nmetadata:\n  name: production-backup\nspec:\n  kafkaCluster:\n    bootstrapServers:\n      - kafka-bootstrap:9092\n    securityProtocol: SASL_SSL\n    tlsSecret:\n      name: kafka-tls\n    saslSecret:\n      name: kafka-sasl\n      mechanism: SCRAM-SHA-512\n  topics:\n    - orders\n    - inventory\n    - events\n  storage:\n    storageType: azure\n    azure:\n      container: kafka-backups\n      accountName: mystorageaccount\n      prefix: production\n      useWorkloadIdentity: true\n  schedule: \"0 0 2 * * * *\"  # Daily at 2 AM\n  compression: zstd\n  compressionLevel: 3\n  stopAtCurrentOffsets: true\n  includeOffsetHeaders: true\n  sourceClusterId: production\n  consumerGroupSnapshot: true\n  checkpoint:\n    enabled: true\n    intervalSecs: 30\n```\n\n### Backup to S3\n\n```yaml\napiVersion: kafka.oso.sh/v1alpha1\nkind: KafkaBackup\nmetadata:\n  name: s3-backup\nspec:\n  kafkaCluster:\n    bootstrapServers:\n      - kafka:9092\n  topics:\n    - my-topic\n  storage:\n    storageType: s3\n    s3:\n      bucket: my-kafka-backups\n      region: eu-west-1\n      endpoint: http://minio.storage.svc.cluster.local:9000\n      pathStyle: true\n      allowHttp: true\n      prefix: backups\n      credentialsSecret:\n        name: aws-credentials\n        accessKeyIdKey: AWS_ACCESS_KEY_ID\n        secretAccessKeyKey: AWS_SECRET_ACCESS_KEY\n  schedule: \"0 0 */4 * * * *\"\n```\n\n### Restore from Backup\n\n```yaml\napiVersion: kafka.oso.sh/v1alpha1\nkind: KafkaRestore\nmetadata:\n  name: restore-orders\nspec:\n  backupRef:\n    name: production-backup\n    backupId: \"production-backup-20251210-020000\"  # Optional: specific backup\n  kafkaCluster:\n    bootstrapServers:\n      - kafka-bootstrap:9092\n  topics:\n    - orders\n  # Optional: Point-in-time recovery\n  pitr:\n    endTime: \"2025-12-10T12:00:00Z\"\n  # Optional: Restore to different topic\n  topicMapping:\n    orders: orders-restored\n  repartitioning:\n    orders-restored:\n      strategy: murmur2\n      targetPartitions: 6\n  createTopics: true\n  produceAcks: -1\n  produceTimeoutMs: 30000\n  autoConsumerGroups: true\n  purgeTopics: false\n  # Safety: Create snapshot before restore\n  rollback:\n    snapshotBeforeRestore: true\n    autoRollbackOnFailure: true\n```\n\n### Reset Consumer Offsets\n\n```yaml\napiVersion: kafka.oso.sh/v1alpha1\nkind: KafkaOffsetReset\nmetadata:\n  name: reset-consumer\nspec:\n  kafkaCluster:\n    bootstrapServers:\n      - kafka-bootstrap:9092\n  consumerGroups:\n    - my-consumer-group\n  topics:\n    - orders\n  resetStrategy: to-earliest\n```\n\n## Helm Values\n\nKey configuration options for the Helm chart:\n\n```yaml\n# values.yaml\nreplicaCount: 1\n\nimage:\n  repository: ghcr.io/osodevops/kafka-backup-operator\n  tag: \"\"  # Defaults to appVersion\n\nserviceAccount:\n  create: true\n  annotations: {}\n\n# Azure Workload Identity\nazureWorkloadIdentity:\n  enabled: false\n  clientId: \"\"\n\n# Logging\nlogging:\n  level: \"info,kafka_backup_operator=debug\"\n\n# Metrics\nmetrics:\n  enabled: true\n  serviceMonitor:\n    enabled: false\n    interval: 30s\n\n# Resources\nresources:\n  requests:\n    cpu: 100m\n    memory: 128Mi\n  limits:\n    cpu: 500m\n    memory: 512Mi\n\n# Extra pod volumes and mounts\nextraVolumes:\n  - name: kafka-tls\n    secret:\n      secretName: kafka-tls-certs\n  - name: custom-config\n    configMap:\n      name: kafka-backup-config\n\nextraVolumeMounts:\n  - name: kafka-tls\n    mountPath: /certs/kafka\n    readOnly: true\n  - name: custom-config\n    mountPath: /config\n    readOnly: true\n\n# Extra env vars for custom CA trust / endpoint behavior\nextraEnv:\n  - name: SSL_CERT_FILE\n    value: /etc/internal-certs/ca.crt\n```\n\n## Azure Workload Identity Setup\n\nFor secure, secretless authentication to Azure Blob Storage:\n\n```bash\n# Enable Workload Identity on AKS\naz aks update --resource-group myRG --name myAKS \\\n  --enable-oidc-issuer --enable-workload-identity\n\n# Create managed identity\naz identity create --resource-group myRG --name kafka-backup-identity\n\n# Assign Storage Blob Data Contributor role\naz role assignment create \\\n  --assignee-object-id $(az identity show -g myRG -n kafka-backup-identity --query principalId -o tsv) \\\n  --role \"Storage Blob Data Contributor\" \\\n  --scope /subscriptions/.../storageAccounts/mystorageaccount\n\n# Create federated credential\naz identity federated-credential create \\\n  --resource-group myRG \\\n  --identity-name kafka-backup-identity \\\n  --name kafka-backup-fedcred \\\n  --issuer $(az aks show -g myRG -n myAKS --query oidcIssuerProfile.issuerUrl -o tsv) \\\n  --subject system:serviceaccount:kafka-backup-system:kafka-backup-operator \\\n  --audience api://AzureADTokenExchange\n\n# Install with Workload Identity enabled\nhelm install kafka-backup-operator oso/kafka-backup-operator \\\n  --namespace kafka-backup-system \\\n  --set azureWorkloadIdentity.enabled=true \\\n  --set azureWorkloadIdentity.clientId=$(az identity show -g myRG -n kafka-backup-identity --query clientId -o tsv)\n```\n\nSee [docs/azure-workload-identity.md](docs/azure-workload-identity.md) for detailed setup instructions.\n\n## Monitoring\n\nThe operator exposes Prometheus metrics on port 8080:\n\n| Metric | Description |\n|--------|-------------|\n| `kafka_backup_reconciliations_total` | Total reconciliation attempts |\n| `kafka_backup_reconcile_duration_seconds` | Reconciliation duration histogram |\n| `kafka_backup_backups_total` | Total backups by status |\n| `kafka_backup_backup_size_bytes` | Backup size in bytes |\n| `kafka_backup_backup_records` | Records processed |\n| `kafka_backup_restores_total` | Total restores by status |\n\n### ServiceMonitor (Prometheus Operator)\n\n```yaml\n# Enable in Helm values\nmetrics:\n  serviceMonitor:\n    enabled: true\n    interval: 30s\n    labels:\n      release: prometheus\n```\n\n## Disaster Recovery Workflow\n\n1. **Normal Operation**: `KafkaBackup` runs on schedule, storing backups to cloud storage\n2. **Disaster Occurs**: Kafka cluster fails or data is corrupted\n3. **Recovery**:\n   - Identify the backup to restore from: `kubectl get kafkabackup my-backup -o yaml`\n   - Create a `KafkaRestore` resource pointing to the backup\n   - Monitor progress: `kubectl get kafkarestore -w`\n4. **Post-Recovery**: Optionally reset consumer offsets with `KafkaOffsetReset`\n\n## Architecture\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│                    Kubernetes Cluster                           │\n│  ┌───────────────────────────────────────────────────────────┐  │\n│  │              kafka-backup-operator                        │  │\n│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │  │\n│  │  │   Backup    │ │   Restore   │ │   Offset Reset/     │ │  │\n│  │  │ Controller  │ │ Controller  │ │   Rollback Ctrl     │ │  │\n│  │  └──────┬──────┘ └──────┬──────┘ └──────────┬──────────┘ │  │\n│  │         │               │                    │            │  │\n│  │         └───────────────┼────────────────────┘            │  │\n│  │                         │                                 │  │\n│  │                         ▼                                 │  │\n│  │              ┌─────────────────────┐                      │  │\n│  │              │  kafka-backup-core  │                      │  │\n│  │              │     (Rust lib)      │                      │  │\n│  │              └─────────────────────┘                      │  │\n│  └───────────────────────────────────────────────────────────┘  │\n│                              │                                   │\n│                              ▼                                   │\n│  ┌───────────────────────────────────────────────────────────┐  │\n│  │                    Kafka Cluster                          │  │\n│  └───────────────────────────────────────────────────────────┘  │\n└─────────────────────────────────────────────────────────────────┘\n                               │\n                               ▼\n┌─────────────────────────────────────────────────────────────────┐\n│                        Cloud Storage                            │\n│      ┌──────────┐    ┌──────────┐    ┌──────────┐              │\n│      │   S3     │    │  Azure   │    │   GCS    │              │\n│      │          │    │   Blob   │    │          │              │\n│      └──────────┘    └──────────┘    └──────────┘              │\n└─────────────────────────────────────────────────────────────────┘\n```\n\n## Development\n\n### Building from Source\n\n```bash\n# Clone the repository\ngit clone https://github.com/osodevops/kafka-backup-operator.git\ncd kafka-backup-operator\n\n# Build\ncargo build --release\n\n# Generate CRDs\ncargo run --bin crdgen \u003e deploy/crds/all.yaml\n\n# Run tests\ncargo test\n```\n\n### Local Development with Minikube\n\nSee [minikube/README.md](minikube/README.md) for local development setup with Confluent for Kubernetes.\n\n## Contributing\n\nContributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md) for details on the process for submitting pull requests.\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Support\n\n- **Issues**: [GitHub Issues](https://github.com/osodevops/kafka-backup-operator/issues)\n- **Discussions**: [GitHub Discussions](https://github.com/osodevops/kafka-backup-operator/discussions)\n\n## Related Projects\n\n- [kafka-backup-core](https://github.com/osodevops/kafka-backup) - The core backup library\n- [OSO DevOps Helm Charts](https://github.com/osodevops/helm-charts) - Helm chart repository\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fosodevops%2Fkafka-backup-operator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fosodevops%2Fkafka-backup-operator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fosodevops%2Fkafka-backup-operator/lists"}