For my first post, I’d like to discuss the strategy I use for backups.
Backups are one of those things that most people realize are important, yet few actually do consistently. Beyond that, I’ve found that very few people test their backups.
There are two scenarios that backups are important for:
- Disaster recovery. After a hard disk crash or other failure, how do I recover a machine to the state it was in before the crash?
- Accident recovery. When something is accidentally erased or modified, how do I recover that file or directory to a state it was in at some point in the past?
These scenarios place different demands on the backup system. They also combine in interesting ways: for example, after a disaster recovery, how can an older version of a file still be restored?
What I do
I’ll start by explaining my system configuration, and what I’m currently backing up.
My primary development system is an Ubuntu 17.04 system with ‘/’ mounted as an ext4 filesystem on an NVMe SSD. I also have a 5-volume spinning-disk ZFS raid-z2 setup. My backups are done to ZFS filesystems, where I can take advantage of snapshots and other features of ZFS.
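For reference, a raid-z2 setup like this can be created with a single zpool command. This is only a sketch: the pool name and device paths below are placeholders of my own choosing, not my actual configuration.

```
# Sketch: creating a 5-drive raid-z2 pool and a dataset to receive backups.
# "tank" and the device names are hypothetical.
zpool create tank raidz2 \
    /dev/disk/by-id/ata-drive1 /dev/disk/by-id/ata-drive2 \
    /dev/disk/by-id/ata-drive3 /dev/disk/by-id/ata-drive4 \
    /dev/disk/by-id/ata-drive5
zfs create -o compression=lz4 tank/rootbackup
```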
This post will be concerned with the backup of ‘/’. A later post will cover backups of the rest of the contents of the ZFS volume.
Each daily backup consists of the following steps (a sketch of the commands follows the list):
- Use rsync to mirror ‘/’ to a ZFS filesystem. I use a bind mount so that this is able to back up things that might be hidden by mountpoints (such as the small /dev that is present).
- Run rsure to capture integrity information for each snapshot. This allows me to verify that a given restore or test restore is correct.
- Create a ZFS snapshot of the form tagnnnn-yyyymmddhhmmss. The ’nnnn’ is an incrementing number, which I currently use to help expire old snapshots. The tag is just a short string that identifies the current backup series.
- Use borg backup to back up this snapshot to a borg repository on the ZFS volume. Again, a bind mount is needed so that the snapshots appear in the same place to borg. This adds additional redundancy.
- Use zfs send/receive to replicate the ZFS snapshot above to an external drive.
- Use borg backup to back up a few key directories to another ZFS volume.
- Using s3fs and rsync, clone this subset borg repository to Amazon S3 for the most important offsite backups.
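To make these steps concrete, here is a rough sketch of what one daily run might look like. All of the pool, dataset, mountpoint, and repository names (tank, rootbackup, /mnt/rootsrc, and so on) are placeholders chosen for this illustration rather than my actual layout, the sequence number is hard-coded, and the rsure invocation is left as a comment since its exact options depend on how the integrity data is stored.

```
#!/bin/bash
# Sketch of one daily backup run.  Pool, dataset, and path names are
# hypothetical; the borg repositories are assumed to already be initialized.
set -e

SEQ=0042                                   # incrementing sequence number
SNAP="tag${SEQ}-$(date +%Y%m%d%H%M%S)"     # e.g. tag0042-20170815030000

# Bind-mount '/' so that files hidden under mountpoints (such as the small
# /dev) are visible to rsync.
mkdir -p /mnt/rootsrc
mount --bind / /mnt/rootsrc

# Mirror the root filesystem into a ZFS filesystem.
rsync -aHAX --delete /mnt/rootsrc/ /tank/rootbackup/

# Capture integrity information with rsure (invocation elided; see the rsure
# documentation for how it maintains its integrity database).
# rsure ... /tank/rootbackup

# Snapshot the backup filesystem.
zfs snapshot tank/rootbackup@"${SNAP}"

# Back up the snapshot into a borg repository on the ZFS volume.  The bind
# mount gives borg a stable path regardless of which snapshot is being read.
mkdir -p /mnt/borgsrc
mount --bind /tank/rootbackup/.zfs/snapshot/"${SNAP}" /mnt/borgsrc
borg create --stats /tank/borg-root::"${SNAP}" /mnt/borgsrc
umount /mnt/borgsrc

# Replicate the snapshot to an external drive with an incremental send from
# the previous snapshot ("extpool" is the pool on the external disk).
zfs send -i tank/rootbackup@tag0041-20170814030000 \
    tank/rootbackup@"${SNAP}" | zfs receive extpool/rootbackup

# Back up a few key directories to another borg repository, then clone that
# repository to Amazon S3 through an s3fs mount.
borg create --stats /tank/borg-key::"${SNAP}" /home/me/keydata /etc
s3fs my-backup-bucket /mnt/s3
rsync -a --delete /tank/borg-key/ /mnt/s3/borg-key/
umount /mnt/s3

umount /mnt/rootsrc
```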
Since my backups live in multiple places and use multiple methods, I can choose a restore technique depending on what has been destroyed and what is left.
- I can restore from the ZFS snapshot using tar or rsync. This is the likely recovery if the NVMe drive were the only thing destroyed.
- I can also restore from the external drive mirror. This should work if the computer is destroyed or fails. Ideally, I would rotate these external drives so that they wouldn’t be destroyed along with the computer (for example, by a power surge).
- My key data can be restored from Amazon S3 in the desperate case. I have a printout of the backup key (with a QR code, a nifty feature of borg backup).
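To illustrate the first of these, a restore from one of the ZFS snapshots onto a freshly prepared root filesystem might look roughly like the following; the dataset, snapshot, and mountpoint names are again placeholders.

```
# Sketch: restoring '/' from a ZFS snapshot after replacing a failed NVMe
# drive.  The new root filesystem is assumed to be mounted at /mnt/newroot.

# Snapshot contents are visible read-only under the dataset's .zfs directory.
SRC=/tank/rootbackup/.zfs/snapshot/tag0042-20170815030000

# Copy everything back, preserving permissions, hard links, ACLs, and xattrs.
rsync -aHAX "${SRC}/" /mnt/newroot/

# Afterwards the bootloader needs reinstalling and /etc/fstab checking
# against the new disk's UUIDs before the system will boot.
```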
The two areas of improvement I’d like to make to my backup strategy are:
- Backing up more data. I have a lot of large things, mostly older data, but also some movies and music. I mostly have ad-hoc clones of this on various other drives, and I could be much better organized.
- Better offsite backups. I currently have two key directories backed up off-site. One is my encfs volume, which holds much of my personal data (finances, passwords, some documents, etc.) as well as most of the dotfiles I use on my machines; I use syncthing to sync it between several machines, including one or two hosts in the cloud. The other is my work data, which I regularly back up to S3 with borg. Ideally, much more of my data would be backed up, either on cloud machines or into S3. With infrequent-access storage on S3, the pricing is quite reasonable.
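As a sketch of what the offsite piece looks like, the work-data borg repository can be mirrored into an S3 bucket through s3fs. The bucket and path names below are placeholders, and the storage_class option (if the installed s3fs supports it) asks S3 to store new objects in the cheaper infrequent-access tier.

```
# Sketch: pushing the work-data borg repository offsite to S3 via s3fs.
# Bucket name and local paths are hypothetical.
s3fs my-offsite-bucket /mnt/s3 -o storage_class=standard_ia

borg create --stats /tank/borg-work::"work-$(date +%Y%m%d)" /home/me/work
rsync -a --delete /tank/borg-work/ /mnt/s3/borg-work/

umount /mnt/s3
```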