https://github.com/fdv/hdfsbackup
A quick and dirty backup tool with a hdfs backend
- Host: GitHub
- URL: https://github.com/fdv/hdfsbackup
- Owner: fdv
- License: MIT
- Created: 2017-05-12T17:24:30.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2017-05-15T06:16:11.000Z (about 9 years ago)
- Last Synced: 2025-05-12T18:17:34.103Z (about 1 year ago)
- Language: Shell
- Size: 16.6 KB
- Stars: 5
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# HDFS Backup
HDFS Backup is a quick and dirty backup tool that uses an HDFS cluster as its backend. It relies on [Colin Marc's HDFS client](https://github.com/colinmarc/hdfs) and two configuration files:
- hdfsbackup.cfg
- includes.cfg
The HDFS client binary is expected to be named `gohdfs` so that it does not conflict with the standard Hadoop `hdfs` command.
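Since the script depends on a binary with that exact name, it can help to confirm it is on the `PATH` before scheduling anything. The following is a minimal sketch of such a check; the `have_client` function is a hypothetical helper for illustration and is not part of `hdfsbackup.sh`.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of hdfsbackup.sh itself.
# Returns 0 if the named command is found on PATH, non-zero otherwise.
have_client() {
    command -v "$1" >/dev/null 2>&1
}

if have_client gohdfs; then
    echo "gohdfs found at $(command -v gohdfs)"
else
    echo "gohdfs not found on PATH; install or rename the client binary" >&2
fi
```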
Usage:
```
./hdfsbackup.sh [/path/to/config.cfg]
```
Or in a crontab (better for rotation):
```
0 */4 * * * /path/to/hdfsbackup.sh [/path/to/config.cfg]
```
If you're using a crontab, make sure you don't run the script more times per day than the number of hourly backups you keep.
Example: if you're keeping 4 hourly backups, run it 4 times a day (every 6 hours).
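The arithmetic behind that rule can be sketched as follows. This is only an illustration: `cron_line` is a hypothetical helper, and it assumes the retention count divides 24 evenly so the runs are spaced uniformly across the day.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab entry that runs a command
# exactly as many times per day as there are hourly backups kept.
#   $1 = number of hourly backups kept (assumed to be a divisor of 24)
#   $2 = command to schedule
cron_line() {
    keep=$1
    cmd=$2
    # Reject values that cannot be spread evenly over 24 hours.
    if [ "$keep" -lt 1 ] || [ $((24 % keep)) -ne 0 ]; then
        echo "retention must be a divisor of 24" >&2
        return 1
    fi
    interval=$((24 / keep))
    echo "0 */$interval * * * $cmd"
}

# Keeping 4 hourly backups gives one run every 6 hours.
cron_line 4 "/path/to/hdfsbackup.sh /path/to/config.cfg"
```

With a retention of 4 this prints `0 */6 * * * /path/to/hdfsbackup.sh /path/to/config.cfg`; a retention of 6 yields the `0 */4` schedule.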
## TODO
- Add support for excluding paths from the backup
- Work out how to deal with symlinks, as HDFS doesn't like them
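
Until symlink handling is settled, one possible stopgap is to list the symlinks under a directory before backing it up, so they can be reviewed by hand. This is a sketch only: `list_symlinks` is a hypothetical helper built on POSIX `find` and is not something `hdfsbackup.sh` provides.

```shell
#!/bin/sh
# Hypothetical helper: print every symbolic link found under directory $1.
# Uses only POSIX find; not part of hdfsbackup.sh.
list_symlinks() {
    find "$1" -type l
}
```

For example, `list_symlinks /etc` prints one symlink path per line.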