PlanB - automating remote backups and snapshots with zfs/rsync
https://github.com/ossobv/planb
Setting up ZFS for PlanB
========================

**This is not a full-blown ZFS setup guide.** But it will provide some tips
to get a *zpool* up and running for *PlanB*.

*Here, the most common setup using raidz2 is described with a concise
explanation of the parameters. For more information, tips and tweaks,
and why you should not skimp and use non-ECC memory, the author refers you to
The Internet™*.

This how-to assumes you're using ZFS 0.8.x on Linux, but it will likely
work on other versions with slight adaptations.

1. `Selecting/preparing disks`_
2. `Using native ZFS encryption`_
3. `Setting up the zpool`_
4. `Explanation of zpool attributes`_
5. `Replacing faulty disks`_

-------------------------
Selecting/preparing disks
-------------------------

So, start with a bunch of disks. Let's say 34 10TB disks:

.. code-block:: console

    # cat /proc/partitions | grep ' sd[a-z]*$' | sort -Vk4
       8        0 9766436864 sda
      65      160 9766436864 sdaa
      65      176 9766436864 sdab
    ...
      65      128 9766436864 sdy
      65      144 9766436864 sdz

The disks don't *need* to have the same size, but it helps (and hot
spares will need to match the largest, for obvious reasons). For the common
setup, you'll use the entire disk and not a partition. (*ZFS* will do
its own partitioning, but you don't need to worry about that.)

You will want to *triple check* which disks you're using. You don't want
to overwrite your operating system (OS) or some other important data.
(In my case, the OS is on separate *nvme* drives, so I can safely use all
*sdX* drives.)
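A quick way to do that triple check is to list the block devices and their
mountpoints before touching anything; none of the *sdX* drives you plan to
use should show a mounted filesystem. (A sketch; the output will obviously
differ per machine.)

.. code-block:: console

    # lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT
    # grep '^/dev/sd' /proc/mounts || echo 'no sdX devices mounted'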
These 34 disks will go into three *ZFS vdevs* (see `vdevs and raidz2`_ below):

1. 10 disks
2. 10 disks
3. 10 disks
4. 4 hot spares

**A pro tip** here is to use the device identifiers instead of the kernel
generated names. I don't think ZFS will have a problem finding the right
device if the kernel renames *sda* to *sdb*, but when you're swapping
defective disks, you'll be happy when you can match
*scsi-SSEAGATE_ST10000NM0226_0123* to the identifier printed on the
physical disk.

So, step 1, find the drives:
.. code-block:: console

    # ls -go /dev/disk/by-id/ | grep '/sda$'
    lrwxrwxrwx 1 10 Jun 24 08:29 scsi-35000c500af2fd4df -> ../../sda
    lrwxrwxrwx 1 10 Jun 24 08:29 scsi-SSEAGATE_ST10000NM0226_0123 -> ../../sda
    lrwxrwxrwx 1 10 Jun 24 08:29 wwn-0x5000c500af2fd4df -> ../../sda

    # ls -go /dev/disk/by-id/ | grep 'scsi-[^ ]*_.*/sd[a-z]*$'
    lrwxrwxrwx 1 9 Jun 24 08:29 scsi-SSEAGATE_ST10000NM0226_0101 -> ../../sdac
    lrwxrwxrwx 1 10 Jun 24 08:29 scsi-SSEAGATE_ST10000NM0226_0123 -> ../../sda
    lrwxrwxrwx 1 10 Jun 24 08:29 scsi-SSEAGATE_ST10000NM0226_0226 -> ../../sde
    ...

    # ls -go /dev/disk/by-id/ | grep 'scsi-[^ ]*_.*/sd[a-z]*$' | wc -l
    34

Drop them in a file somewhere:
.. code-block:: console

    # ls -go /dev/disk/by-id/ | grep 'scsi-[^ ]*_.*/sd[a-z]*$' |
        awk '{print $7}'
    scsi-SSEAGATE_ST10000NM0226_0101
    scsi-SSEAGATE_ST10000NM0226_0123
    scsi-SSEAGATE_ST10000NM0226_5148
    ...

    # ls -go /dev/disk/by-id/ | grep 'scsi-[^ ]*_.*/sd[a-z]*$' |
        awk '{print $7}' >disks

However, *now they are sorted by serial number*. I don't know if the
serials are generated incrementally, but if they are, those with similar
numbers *may be part of a bad batch*. **We don't want all bad disks to
end up on the same vdev. If a vdev fails, all data is lost.**

So, to counter that, a simple ``shuf`` (shuffle) of the data is
sufficient to ease my paranoia.

.. code-block:: console

    # ls -go /dev/disk/by-id/ | grep 'scsi-[^ ]*_.*/sd[a-z]*$' |
        awk '{print $7}' | shuf >disks

Okay, now that the disks are shuffled, open an editor on the created
``disks`` file and prepend numbers:
``0`` (and a space) before the first 10 disks, ``1`` before the next 10,
then ``2`` and lastly ``S`` for the spares. Your file now looks like
this::

    0 scsi-SSEAGATE_ST10000NM0226_6351
    0 scsi-SSEAGATE_ST10000NM0226_0226
    0 scsi-SSEAGATE_ST10000NM0226_8412
    ...
    1 scsi-SSEAGATE_ST10000NM0226_0123
    ...
    S scsi-SSEAGATE_ST10000NM0226_8412
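If you prefer not to edit the file by hand, a quick ``awk`` pass can add the
group labels for you. This is just a sketch assuming the 10/10/10/4 split
used here; the ``disks.labeled`` name is arbitrary:

.. code-block:: console

    # awk '{ if (NR <= 10) g = 0; else if (NR <= 20) g = 1;
             else if (NR <= 30) g = 2; else g = "S";
             print g, $0 }' disks >disks.labeled
    # mv disks.labeled disks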
That's nice, because now we can quickly get the chosen disks from that file.
For example, find ``S `` to get the 4 spares:

.. code-block:: console

    # awk '/^S /{print "disk/by-id/" $2}' disks
    disk/by-id/scsi-SSEAGATE_ST10000NM0226_9866
    disk/by-id/scsi-SSEAGATE_ST10000NM0226_5992
    disk/by-id/scsi-SSEAGATE_ST10000NM0226_5900
    disk/by-id/scsi-SSEAGATE_ST10000NM0226_8412

---------------------------
Using native ZFS encryption
---------------------------

If you're using *ZFS on Linux* 0.8.x or higher, you can use native
encryption. You should enable this on the pool directly. Now *all child
datasets* will use encryption.

Don't worry about the key just yet. You can always change it, as it is a
*wrapping key* only; that is, *the key is used to decrypt the real key
which never changes.*

For now, start out with a passphrase key:

.. code-block:: console

    # pwgen -s 512 1
    abcdef...
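Should you ever want to rotate that passphrase later on, you only swap the
wrapping key; the data on disk is not re-encrypted. A sketch of what that
looks like once the pool (called ``tank`` below) exists:

.. code-block:: console

    # zfs change-key -o keylocation=prompt -o keyformat=passphrase tank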
--------------------
Setting up the zpool
--------------------

If you prepared which disks you'll be using according to the method
described above, you now have a ``disks`` file with a destination
"number" and a disk identifier.

Setting up three *vdevs* and a set of spares is then as easy as this:
.. code-block:: console

    # zpool create -o ashift=12 \
        -O canmount=off -O xattr=sa \
        -O compression=lz4 -O encryption=aes-256-gcm \
        -O keylocation=prompt -O keyformat=passphrase \
        tank raidz2 \
        $(awk '/^0 /{print "disk/by-id/" $2}' disks)

    # zpool add tank raidz2 $(awk '/^1 /{print "disk/by-id/" $2}' disks)
    # zpool add tank raidz2 $(awk '/^2 /{print "disk/by-id/" $2}' disks)
    # zpool add tank spare $(awk '/^S /{print "disk/by-id/" $2}' disks)

Check the ``zpool status``:
.. code-block:: console

    # zpool status
      pool: tank
     state: ONLINE
      scan: none requested
    config:

        NAME                                    STATE
        tank                                    ONLINE
          raidz2-0                              ONLINE
            scsi-SSEAGATE_ST10000NM0226_6351    ONLINE
            scsi-SSEAGATE_ST10000NM0226_0226    ONLINE
            scsi-SSEAGATE_ST10000NM0226_8412    ONLINE
            scsi-SSEAGATE_ST10000NM0226_...     ONLINE
            ...
          raidz2-1                              ONLINE
            scsi-SSEAGATE_ST10000NM0226_0123    ONLINE
            scsi-SSEAGATE_ST10000NM0226_...     ONLINE
            scsi-SSEAGATE_ST10000NM0226_...     ONLINE
            scsi-SSEAGATE_ST10000NM0226_...     ONLINE
            ...
          raidz2-2                              ONLINE
            scsi-SSEAGATE_ST10000NM0226_...     ONLINE
            scsi-SSEAGATE_ST10000NM0226_...     ONLINE
            scsi-SSEAGATE_ST10000NM0226_...     ONLINE
            scsi-SSEAGATE_ST10000NM0226_...     ONLINE
            ...
        spares
          scsi-SSEAGATE_ST10000NM0226_9866      AVAIL
          scsi-SSEAGATE_ST10000NM0226_5992      AVAIL
          scsi-SSEAGATE_ST10000NM0226_5900      AVAIL
          scsi-SSEAGATE_ST10000NM0226_8412      AVAIL

Nice and shiny!
With:

* readable device IDs instead of *kernel-generated sdX numbers*;
* shuffled disks to reduce the chance of a batch of bad disks ending up
  on the same vdev.
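One thing to remember with a passphrase-encrypted pool: after a reboot the
key is not loaded automatically, so the datasets stay unmounted until you
provide the passphrase again. A minimal sketch:

.. code-block:: console

    # zfs load-key tank
    # zfs mount -a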
-------------------------------
Explanation of zpool attributes
-------------------------------

vdevs and raidz2
~~~~~~~~~~~~~~~~

A *ZFS pool* consists of one or more *vdevs* (and optionally spares,
cache and log disks).

Every *vdev* itself must be redundant. *If one of the vdevs fails, your
entire pool fails.* A *minimal* safe setup would be:

* one *vdev* with two *mirrored* disks.

By adding more *vdevs*, *ZFS* will do a *form of striping* on those
(i.e. more read/write speed). It makes sense to make all *vdevs* equal in
size, but it is not mandatory.

Our setup uses:

* three *vdevs* with ten *raidz2* disks per *vdev* and four *spare* disks.

*raidz2* is the *ZFS* equivalent of *raid6*; in our case 8 data disks and 2
parity disks. Two disks in the same *vdev* are allowed to fail. And upon
failure, a spare is automatically activated.
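As a rough back-of-the-envelope check (ignoring metadata, slop space and
compression), usable capacity for this layout works out to::

    3 vdevs x 8 data disks x 10 TB = ~240 TB

The four spares add no capacity; they only jump in when a disk fails.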
zpool create options
~~~~~~~~~~~~~~~~~~~~

In the create commands above, we use ``ashift=12``, ``canmount=off``,
``xattr=sa``, ``compression=lz4`` and ``encryption=aes-256-gcm``:

* ``ashift=12``: Because most newer disks emulate having 512-byte sectors
  (the default ``ashift=9``) but in reality have 4K sectors
  (``ashift=12``), you'll want this option for performance.
* ``canmount=off``: Because we don't want to write in the root dataset.
* ``xattr=sa``: Lets us add extended attributes in inodes. We don't use
  them for now, but they can be nice to have later.
* ``compression=lz4``: LZ4 is a relatively fast compression scheme that
  gives you better performance, and improves the security of the
  encryption (because of the increased entropy). *(Note that we'll
  consider CRIME-based attacks (using partial compression to attack
  encryption) irrelevant on the local system.)*
* ``encryption=aes-256-gcm``: Yes. We want the best native encryption we
  can get now.
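If you want to verify that these settings actually ended up on the pool and
the root dataset, you can read the properties back (a quick sketch):

.. code-block:: console

    # zpool get ashift tank
    # zfs get canmount,xattr,compression,encryption,keyformat tank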
----------------------
Replacing faulty disks
----------------------

When you're dealing with a DEGRADED array, you'll want to use the ``zpool
replace`` command. Mark the failing disk offline using ``zpool offline`` and
replace that disk with the new one.

.. code-block:: console

    # zpool offline tank scsi-SSEAGATE_ST10000NM0226_0123
    # ledctl locate=/dev/disk/by-id/scsi-SSEAGATE_ST10000NM0226_0123

Swap the disks, and replace:
.. code-block:: console

    # zpool replace tank scsi-SSEAGATE_ST10000NM0226_0123 \
        /dev/disk/by-id/scsi-NEW_DISK
    # ledctl locate_off=/dev/disk/by-id/scsi-NEW_DISK

If you have trouble finding the original disk, use ``zdb`` to
list the disks. You can then use the ``guid`` as *old* disk when
replacing.
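A sketch of what that can look like: ``zdb`` (without arguments) dumps the
cached pool configuration, including a numeric ``guid`` for every child
device, and ``zpool replace`` accepts that guid in place of the old device
name:

.. code-block:: console

    # zdb | less
    # zpool replace tank <old-disk-guid> /dev/disk/by-id/scsi-NEW_DISK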
hot spares
~~~~~~~~~~

When the *hot spares* work as intended, a failing disk will have been
substituted by a spare already. But you'll still need to manually swap
them out to make the array completely online::
    # zpool status
    ...
        NAME                                      STATE
        tank                                      DEGRADED
          raidz2-0                                ONLINE
            scsi-SSEAGATE_ST10000NM0226_6351      ONLINE
            scsi-SSEAGATE_ST10000NM0226_0226      ONLINE
            scsi-SSEAGATE_ST10000NM0226_8412      ONLINE
            scsi-SSEAGATE_ST10000NM0226_...       ONLINE
            ...
          raidz2-1                                DEGRADED
            spare-1                               DEGRADED
              scsi-SSEAGATE_ST10000NM0226_0123    UNAVAIL
              scsi-SSEAGATE_ST10000NM0226_9866    ONLINE
            scsi-SSEAGATE_ST10000NM0226_...       ONLINE
            scsi-SSEAGATE_ST10000NM0226_...       ONLINE
            scsi-SSEAGATE_ST10000NM0226_...       ONLINE
            ...
          raidz2-2                                ONLINE
            scsi-SSEAGATE_ST10000NM0226_...       ONLINE
            scsi-SSEAGATE_ST10000NM0226_...       ONLINE
            scsi-SSEAGATE_ST10000NM0226_...       ONLINE
            scsi-SSEAGATE_ST10000NM0226_...       ONLINE
            ...
        spares
          scsi-SSEAGATE_ST10000NM0226_9866        INUSE
          scsi-SSEAGATE_ST10000NM0226_5992        AVAIL
          scsi-SSEAGATE_ST10000NM0226_5900        AVAIL
          scsi-SSEAGATE_ST10000NM0226_8412        AVAIL

This requires some manual action:
.. code-block:: console

    # zpool detach tank scsi-SSEAGATE_ST10000NM0226_0123

Now the array should be ONLINE again, and ``scsi-SSEAGATE_ST10000NM0226_9866``
will be gone from the spares list.

Use ``ledctl`` to find the broken disk, physically replace it with a new
one, and lastly ``zpool add tank spare NEW_DISK``.
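Putting those pieces together, the whole swap (after the spare has taken
over automatically) might look something like this; the device names are of
course placeholders:

.. code-block:: console

    # ledctl locate=/dev/disk/by-id/scsi-SSEAGATE_ST10000NM0226_0123
    # zpool detach tank scsi-SSEAGATE_ST10000NM0226_0123
    ... pull the dead drive, insert the new one ...
    # ledctl locate_off=/dev/disk/by-id/scsi-NEW_DISK
    # zpool add tank spare /dev/disk/by-id/scsi-NEW_DISK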