https://github.com/itadventurer/kafka-backup
Backup and Restore for Apache Kafka
backup kafka kafka-connect
- Host: GitHub
- URL: https://github.com/itadventurer/kafka-backup
- Owner: itadventurer
- License: apache-2.0
- Created: 2019-06-02T21:21:59.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-12-04T10:04:02.000Z (2 months ago)
- Last Synced: 2025-02-01T23:07:12.812Z (10 days ago)
- Topics: backup, kafka, kafka-connect
- Language: Java
- Size: 2.71 MB
- Stars: 166
- Watchers: 7
- Forks: 47
- Open Issues: 42
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-kafka - Backup and Restore topics & offsets
README
# Kafka Backup
> **Update:** I am no longer maintaining the Kafka Backup project. As an alternative, I recommend [Kannika](https://kannika.io/?utm_source=github_anatoly), a commercial backup solution developed by my friends at [Cymo](https://cymo.eu/?utm_source=github_anatoly) (and don't forget to say hello from Anatoly 😊).
> [Disclosure: I am a business partner of Cymo and may receive compensation for referrals to Kannika]
>
> Please contact me if you want to continue maintaining this project.

Kafka Backup is a tool to back up and restore your Kafka data
including all (configurable) topic data and especially also consumer
group offsets. To the best of our knowledge, Kafka Backup is the only
viable solution to take a cold backup of your Kafka data and restore
it correctly.

It is designed as two connectors for Kafka
Connect: A sink connector (backing data up) and a source connector
(restoring data).

Currently `kafka-backup` supports backup and restore to/from the file
system.

## Features
* Backup and restore topic data
* Backup and restore consumer-group offsets
* Currently supports only backup/restore to/from local file system
* Released as a jar file or packaged as a Docker image

# Getting Started
**Option A) Download binary**

Download the latest release [from GitHub](https://github.com/itadventurer/kafka-backup/releases) and unzip it.

**Option B) Use Docker image**

Pull the latest Docker image from [Docker Hub](https://hub.docker.com/repository/docker/itadventurer/kafka-backup/tags).

**DO NOT USE THE `latest` TAG IN PRODUCTION**. `latest` images are automatic builds of the master branch. Be careful!

**Option C) Build from source**

Run `./gradlew shadowJar` in the root directory of Kafka Backup. You will find the CLI tools in the `bin` directory.
## Start Kafka Backup
```sh
backup-standalone.sh --bootstrap-server localhost:9092 \
--target-dir /path/to/backup/dir --topics 'topic1,topic2'
```

In Docker:

```sh
docker run -d -v /path/to/backup-dir/:/kafka-backup/ --rm \
kafka-backup:[LATEST_TAG] \
backup-standalone.sh --bootstrap-server kafka:9092 \
--target-dir /kafka-backup/ --topics 'topic1,topic2'
```

You can pass options via CLI arguments or using environment variables:

| Parameter | Type/required? | Description |
|-----------|----------------|-------------|
| `--bootstrap-server`<br/>`BOOTSTRAP_SERVER` | [REQUIRED] | The Kafka server to connect to |
| `--target-dir`<br/>`TARGET_DIR` | [REQUIRED] | Directory where the backup files should be stored |
| `--topics`<br/>`TOPICS` | | List of topics to be backed up. You must provide either `--topics` or `--topics-regex`, not both |
| `--topics-regex`<br/>`TOPICS_REGEX` | | Regex of topics to be backed up. You must provide either `--topics` or `--topics-regex`, not both |
| `--max-segment-size`<br/>`MAX_SEGMENT_SIZE` | | Size of the backup segments in bytes. Default: 1GiB |
| `--command-config`<br/>`COMMAND_CONFIG` | | Property file containing configs to be passed to Admin Client. Only useful if you have additional connection options |
| `--debug`<br/>`DEBUG=y` | | Print debug information |
| `--help` | | Prints this message |

**Kafka Backup does not stop!** The backup process is a continuous background job that runs forever, as Kafka models data as a stream without end. See [Issue 52: Support point-in-time snapshots](https://github.com/itadventurer/kafka-backup/issues/52) for more information.
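
Because the backup options can come from environment variables, it can help to check them before starting the long-running job. The following is only a sketch: `check_backup_env` is a hypothetical helper, not part of Kafka Backup, and it merely mirrors the rules stated in the table above (two required variables, and exactly one of `TOPICS` or `TOPICS_REGEX`).

```sh
#!/bin/sh
# Hypothetical pre-flight check (not shipped with Kafka Backup).
# Returns 0 if the environment satisfies the rules from the options table,
# and 1 with a message on stderr otherwise.
check_backup_env() {
    if [ -z "${BOOTSTRAP_SERVER:-}" ]; then
        echo "BOOTSTRAP_SERVER is required" >&2
        return 1
    fi
    if [ -z "${TARGET_DIR:-}" ]; then
        echo "TARGET_DIR is required" >&2
        return 1
    fi
    # Exactly one of TOPICS / TOPICS_REGEX must be set.
    if [ -n "${TOPICS:-}" ] && [ -n "${TOPICS_REGEX:-}" ]; then
        echo "Set either TOPICS or TOPICS_REGEX, not both" >&2
        return 1
    fi
    if [ -z "${TOPICS:-}" ] && [ -z "${TOPICS_REGEX:-}" ]; then
        echo "Set one of TOPICS or TOPICS_REGEX" >&2
        return 1
    fi
    return 0
}
```

Assuming the variables are exported so the script can read them, a wrapper could then run `check_backup_env && backup-standalone.sh`.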
## Restore data
```sh
restore-standalone.sh --bootstrap-server localhost:9092 \
--source-dir /path/to/backup/dir --topics 'topic1,topic2'
```

In Docker:

```sh
docker run -v /path/to/backup/dir:/kafka-backup/ --rm \
kafka-backup:[LATEST_TAG] \
restore-standalone.sh --bootstrap-server kafka:9092 \
--source-dir /kafka-backup/ --topics 'topic1,topic2'
```

You can pass options via CLI arguments or using environment variables:

| Parameter | Type/required? | Description |
|-----------|----------------|-------------|
| `--bootstrap-server`<br/>`BOOTSTRAP_SERVER` | [REQUIRED] | The Kafka server to connect to |
| `--source-dir`<br/>`SOURCE_DIR` | [REQUIRED] | Directory where the backup files are found |
| `--topics`<br/>`TOPICS` | [REQUIRED] | List of topics to restore |
| `--batch-size`<br/>`BATCH_SIZE` | | Batch size (Default: 1MiB) |
| `--offset-file`<br/>`OFFSET_FILE` | | File where to store offsets. THIS FILE IS CRUCIAL FOR A CORRECT RESTORATION PROCESS. IF YOU LOSE IT, YOU NEED TO START THE BACKUP FROM SCRATCH. OTHERWISE YOU WILL HAVE DUPLICATE DATA. Default: `[source-dir]/restore.offsets` |
| `--command-config`<br/>`COMMAND_CONFIG` | | Property file containing configs to be passed to Admin Client. Only useful if you have additional connection options |
| `--help`<br/>`HELP` | | Prints this message |
| `--debug`<br/>`DEBUG` | | Print debug information (if using the environment variable, set it to 'y') |

## More Documentation
* [FAQ](./docs/FAQ.md)
* [High Level Introduction](./docs/Blogposts/2019-06_Introducing_Kafka_Backup.md)
* [Comparing Kafka Backup Solutions](./docs/Comparing_Kafka_Backup_Solutions.md)
* [Architecture](./docs/Kafka_Backup_Architecture.md)
* [Tooling](./docs/Tooling.md)

## License
This project is licensed under the Apache License Version 2.0 (see
[LICENSE](./LICENSE)).