Abstract

The last post of this series worked through replacing failed disks with ZFS. This post covers taking ZFS snapshots and using them for data recovery; for the purposes of this post, I am using a FreeBSD machine with ZFS on root, and only working with the single zroot pool that the FreeBSD installer creates by default.

Taking snapshots with ZFS

By default, non-root users have very little privilege when it comes to ZFS. That is a good thing, but it means doing everything as root, which is annoying and bad practice. So, the first thing to do is to grant another user the necessary permissions within ZFS.

# The following command must be run as root
# substitute ${USER} with the username that will need ZFS permissions
zfs allow -u ${USER} canmount,compression,create,destroy,hold,mount,mountpoint,receive,send,snapshot zroot/ROOT

For more information on what these permissions do, read the zfs-allow(8) man page.
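
To double check that the delegation took effect, run zfs allow with just the dataset name. The output should look something like this (exact formatting varies by version):

zfs allow zroot/ROOT
---- Permissions on zroot/ROOT ----------------------------------
Local+Descendent permissions:
        user ${USER} canmount,compression,create,destroy,hold,mount,mountpoint,receive,send,snapshot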

Now the user specified in the command above should be able to create snapshots, as well as use some other ZFS features. Taking a snapshot of the zroot/ROOT dataset is as simple as running:

# This would take a snapshot of zroot/ROOT with the tag of bk01
zfs snapshot zroot/ROOT@bk01
# This would recursively take a snapshot of zroot/ROOT with the tag of bk02
zfs snapshot -r zroot/ROOT@bk02
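
Snapshot names are free-form, so a date-based naming scheme makes regular snapshots much easier to keep track of. A minimal sketch (the schedule and names here are only examples):

# take a recursive snapshot named after today's date, e.g. zroot/ROOT@2024-05-01
zfs snapshot -r zroot/ROOT@$(date +%Y-%m-%d)
# the same command in /etc/crontab takes one automatically every night at 02:00
# (the % signs must be escaped, since cron treats a bare % specially)
# 0  2  *  *  *  root  zfs snapshot -r zroot/ROOT@$(date +\%Y-\%m-\%d)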

While that is great, it doesn't actually back up the entire system. Running zfs list on a standard ZFS on root install of FreeBSD will turn up something like the following:

zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
zroot                959M  25.7G    66K  /zroot
zroot/ROOT           956M  25.7G    66K  none
zroot/ROOT/default   956M  25.7G   951M  /
zroot/home           400K  25.7G    67K  /home
zroot/home/${USER}   333K  25.7G   179K  /home/${USER}
zroot/tmp            134K  25.7G    73K  /tmp
zroot/usr            198K  25.7G    66K  /usr
zroot/usr/ports       66K  25.7G    66K  /usr/ports
zroot/usr/src         66K  25.7G    66K  /usr/src
zroot/var            741K  25.7G    66K  /var
zroot/var/audit       68K  25.7G    68K  /var/audit
zroot/var/crash     66.5K  25.7G  66.5K  /var/crash
zroot/var/log        340K  25.7G   212K  /var/log
zroot/var/mail       134K  25.7G    77K  /var/mail
zroot/var/tmp         67K  25.7G    67K  /var/tmp

The zroot/ROOT dataset only contains the base installation files of the system; a backup made from zroot/ROOT@bk02 would be missing the /home, /tmp, /usr, and /var directories from the original system. Getting a snapshot of those can be done by running the following:

# This will create a full system snapshot with the tag full-bk01
zfs snapshot -r zroot@full-bk01

To get a list of available snapshots on the system, run zfs list -t snapshot. This is output from the VM I am using for testing, which had a snapshot of the entire system taken with the tag bk01:

zfs list -t snapshot
NAME                      USED  AVAIL  REFER  MOUNTPOINT
zroot@bk01                  0B      -    66K  -
zroot/ROOT@bk01             0B      -    66K  -
zroot/ROOT/default@bk01  4.99M      -   948M  -
zroot/home@bk01             0B      -    67K  -
zroot/home/${USER}@bk01   154K      -   180K  -
zroot/tmp@bk01             61K      -    73K  -
zroot/usr@bk01              0B      -    66K  -
zroot/usr/ports@bk01        0B      -    66K  -
zroot/usr/src@bk01          0B      -    66K  -
zroot/var@bk01              0B      -    66K  -
zroot/var/audit@bk01        0B      -    68K  -
zroot/var/crash@bk01        0B      -  66.5K  -
zroot/var/log@bk01        128K      -   177K  -
zroot/var/mail@bk01      56.5K      -  72.5K  -
zroot/var/tmp@bk01          0B      -    67K  -
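
Snapshots cost almost nothing at first but grow as the live data diverges from them, so old ones eventually need pruning; the destroy permission delegated earlier covers this. A quick sketch (running the dry-run flags first is a good habit):

# show what a recursive destroy of the bk01 snapshots would remove, without doing it
zfs destroy -rnv zroot@bk01
# actually destroy the bk01 snapshot on zroot and every child dataset
zfs destroy -r zroot@bk01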

Next, let’s see how to send those snapshots off to some backup storage in case the system fails.

Sending snapshots to a backup server

Having a backup file of a system is great, but it needs to be accessible from another location in case the system fails, so that the original machine can be restored to a functioning state. Thankfully, doing that with ZFS is fairly straightforward using zfs send. These examples only cover sending snapshots as files; they can also be sent as entire datasets to another system running ZFS. I had issues doing that with a root-on-ZFS snapshot, but that might have been a skill issue that someone more knowledgeable could overcome. To send the snapshot to a file, run the following:

# This will create a file called `full-backup.zfs` containing the snapshot `zroot@full-bk01`
# The file can be sent with standard tools like scp or rsync, or stored on a fileshare.
zfs send -R zroot@full-bk01 > full-backup.zfs
# This example compresses the snapshot `zroot@full-bk01` with gzip and sends it to a remote machine called `backup.server`
# The compressed file will then be available on that server
zfs send -R zroot@full-bk01 | gzip | ssh ${USER}@backup.server "cat > /backup/directory/location/full-bk01.zfs.gz"
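
Full replication streams get large quickly. Once an initial full snapshot is on the backup server, the -i flag to zfs send can ship only the difference between two snapshots; a sketch, assuming full-bk01 already exists on both ends and a newer full-bk02 has just been taken:

# take a newer recursive snapshot, then send only the changes since full-bk01
zfs snapshot -r zroot@full-bk02
zfs send -R -i zroot@full-bk01 zroot@full-bk02 | gzip | ssh ${USER}@backup.server "cat > /backup/directory/location/full-bk01-to-bk02.zfs.gz"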

Now that the snapshot has been moved to another system as a proper backup, let's go over how to actually use it to recover lost or destroyed data.

Restoring from snapshots

Now that we have a proper backup stored somewhere other than the machine itself, we need to figure out how to actually use it as a backup. There are three main cases I am going to cover in this particular blog post, which should handle most situations.

Restoring files, à la carte

This case assumes that some (probably user) error happened: the system itself is okay, but some files were deleted or modified in a bad way and need to be restored. Assuming we have a snapshot that contains the necessary files (having regular snapshots with a good naming convention is going to save you here), this can be done very easily. ZFS keeps a hidden .zfs directory at the root of every dataset's mountpoint. Within its snapshot subdirectory, the different snapshots of that dataset show up by their tag name (for example, if the snapshots are zroot@bk01 and zroot@bk02, there will be bk01 and bk02 directories), and within those directories are the files as they existed at the time of the snapshot. Accessing and restoring files on the machine can be done by:

# assuming we are trying to recover `file.md` that was contained in snapshot `zroot@full-bk01`
# cd into the snapshot dir
cd ~/.zfs/snapshot/full-bk01
# copy the file back to the live file system
cp file.md ~/

The simplicity of this file recovery method is amazing. It is a little odd that the .zfs directory doesn't show up when running ls -a (its visibility is controlled by the dataset's snapdir property), but cd .zfs works regardless. Beyond that, recovering files is just a normal, mundane file operation that any Linux or BSD user should be comfortable with.
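
If it is not obvious which files changed since a snapshot, zfs diff can list them. Note that diff is its own delegable permission and was not part of the zfs allow command earlier, so this may need root or an updated delegation:

# list files changed on the dataset since the bk01 snapshot
# legend: M = modified, - = removed, + = created, R = renamed
zfs diff zroot/home/${USER}@bk01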

Rolling back the file system

This example assumes that something has gone quite wrong, and the file system needs to be taken back to a previous snapshot state (this can also be done with specific datasets, e.g. zroot/home). For this we use the zfs rollback command. It is worth noting that rollback is not one of the permissions delegated earlier in this post, so unless you see rollback somewhere when running zfs allow on your pool and/or dataset, it will have to be run as root as well. I think I prefer rollbacks being restricted to root, as they can be damaging, and people should think about commands more carefully when running as root anyway. To roll the top-level dataset back, run:

zfs rollback -r zroot@full-bk01

This brings the zroot dataset back to the state it was in when the snapshot zroot@full-bk01 was taken. One caveat: the -r flag here destroys any snapshots more recent than the target; it does not recurse into child datasets, so each dataset has to be rolled back individually to restore the whole system. To roll back a specific dataset:

# assuming rolling back the zroot/tmp dataset at snapshot full-bk01
zfs rollback -r zroot/tmp@full-bk01
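
Since there is no flag that recurses over child datasets, bringing the whole system back in one go means looping over them. A minimal sketch, assuming every dataset has a full-bk01 snapshot (and keeping in mind that -r destroys any newer snapshots along the way):

# roll every dataset under zroot back to its full-bk01 snapshot
zfs list -H -o name -r zroot | while read ds; do
    zfs rollback -r "${ds}@full-bk01"
done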

Entire root file system

The final example assumes that the entire system has been corrupted/deleted/caught fire/nuked/etc. and has to be fully recovered from backup onto a brand new hard drive. Make sure the new drive is connected to the system, and boot a live FreeBSD environment. I used mfsBSD (found thanks to this blog post), as it allowed me to mount a zpool to /mnt, whereas the standard live FreeBSD system would not, since it is read-only.

# assuming the drive is `ada0`
# also assuming a basic FreeBSD installation partition scheme
gpart create -s gpt ada0
gpart add -t freebsd-boot -s 512k ada0
gpart add -t freebsd-swap -s 2G ada0
gpart add -t freebsd-zfs -l disk0 ada0
# don't forget this, otherwise the system will not boot
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
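
Before creating the pool, it is worth sanity checking that the partition table came out as intended:

# print the partition layout of ada0; expect the boot, swap, and zfs partitions from above
gpart show ada0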

From here, we are ready to actually get the backup onto the system and restore it. It's worth noting that there are a lot of different ways to get the snapshot file from the backup machine onto the machine we are trying to recover (the recovery machine). We can do a pull, where we are logged into the recovery machine and transfer the file from the backup server by "pulling" it off. We can do a push, where we log into the backup machine and push a copy of the file onto the recovery machine. We could also do something like transferring the file via a USB flash drive.

zpool create -d -o altroot=/mnt zroot ada0p3
# Example of a "pull": from the recovery machine, pull the snapshot off of the backup server
# (note that gunzip and zfs receive run locally, outside the quoted ssh command)
ssh ${USER}@backup.server "cat /path/to/backup/full-bk01.zfs.gz" | gunzip | zfs receive -vF zroot
# Example of a "push": from the backup server, push the snapshot onto the recovery machine
cat /path/to/backup/full-bk01.zfs.gz | gunzip | ssh ${USER}@recovery.machine "zfs receive -vF zroot"
# also do not forget this, otherwise the system will not boot
zpool set bootfs=zroot/ROOT/default zroot
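
One suggestion of my own (your live environment may differ): since the pool was imported with a temporary altroot, export it cleanly before rebooting so the restored system can import it at boot:

# cleanly export the pool before rebooting into the restored system
zpool export zroot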

Closing thoughts

ZFS is a lot easier to learn than I thought it would be; however, that does not make it simple. It is definitely a great choice for a sysadmin trying to make life easier, once they know how to use it.

Researching resources

Some of the resources that I found helpful while working on this blog post.