BTRFS snapshots and making backups - an opinion on concepts and functionality.

    Background and info:

    I have often said the biggest advantage of using BTRFS vs. other file systems is the usefulness of subvolumes. Basically, subvolumes let you segregate parts of your install while still using only a single file system. Subvolumes divide up the space just like partitions do, but they also allow all the available space to be shared among all the subvolumes without restriction.

    Subvolumes have two additional capabilities: they can be snapshotted - like taking pictures in time - and they can be transmitted from one BTRFS file system to another, which is a form of making a backup.

    For those who are new to BTRFS, the file system divides the available storage space into several parts. For this discussion, I will refer to "data" - actual file storage, and "metadata" - file system information on what and where your data files are located.

    A snapshot must reside on the same file system as the source subvolume (the one you took a snapshot of). If you transmit a snapshot to a different file system, it ceases to be a snapshot and becomes a stand-alone subvolume - a backup copy of the source subvolume. You should realize that a snapshot, while a copy of your subvolume, is not a backup. If the file system becomes corrupted or the device dies, both your subvolume and all its snapshots die together. This is why we use the BTRFS send and receive commands to create a backup on another device.

    Note that taking a snapshot happens instantaneously, and at the time it is taken, consumes no data space. Only the metadata is actually duplicated. As time goes on and changes are made to the source subvolume, the metadata and data space consumed by the snapshot grow.

    Backups and snapshot methodologies:

    There are many methods to make backups of our data and many types. Fortunately, BTRFS has backup capability built in and it's simple: take a read-only snapshot, use "send" and "receive" commands to transmit it to another device, done. There are valid reasons why you may want a backup on another PC or external device entirely, rather than just on a second drive on your computer. I will not discuss those reasons or other backup methods that may be more suited for this purpose, other than to say that BTRFS allows you to "send" a subvolume to a file for off-line device storage or copying or restoring to a different machine.
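
    As a rough illustration of the snapshot, send, and receive steps described above, here is a minimal Python sketch that only builds and prints the commands rather than running them. The paths (/mnt/pool/@, /mnt/pool/snapshots, /mnt/backup), the snapshot name, and the helper function names are made up for the example; adjust them to your own layout.

```python
import shlex


def full_backup_commands(source, snapshot, backup_dir):
    """Build the commands for a read-only snapshot plus a full send/receive.

    All paths are hypothetical examples, not the author's actual layout.
    """
    # -r makes the snapshot read-only, which "btrfs send" requires.
    take_snapshot = ["btrfs", "subvolume", "snapshot", "-r", source, snapshot]
    send = ["btrfs", "send", snapshot]
    receive = ["btrfs", "receive", backup_dir]
    return take_snapshot, send, receive


def send_to_file_command(snapshot, stream_file):
    """Build the command that writes the send stream to a file instead of a pipe."""
    return ["btrfs", "send", "-f", stream_file, snapshot]


if __name__ == "__main__":
    snap, send, recv = full_backup_commands(
        "/mnt/pool/@",                  # hypothetical source subvolume
        "/mnt/pool/snapshots/@_full",   # hypothetical snapshot name
        "/mnt/backup",                  # hypothetical mount point of the backup device
    )
    print(shlex.join(snap))
    # The send stream is piped straight into receive on the other file system.
    print(shlex.join(send) + " | " + shlex.join(recv))
    print(shlex.join(send_to_file_command("/mnt/pool/snapshots/@_full", "/tmp/root.btrfs")))
```

    The commands themselves need root privileges, so the sketch prints them for review instead of executing anything.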

    This discussion about backup and snapshot methodologies includes frequency (how often), quantity (how many), retention time (how long), and type of backup.

    Everyone's needs or thoughts on this will likely differ. I prefer a simple and easy approach:
    • Make a root subvolume snapshot every morning
    • Delete the oldest snapshot every morning, keeping 7 days' worth of snapshots
    • Send an incremental backup every Sunday
    This seems sufficient for my needs.
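
    The "keep 7 days" part of the routine above could be sketched like this. The sketch assumes each snapshot is identified by the date it was taken and named @_YYYY-MM-DD inside a snapshot directory; that naming scheme and the function names are assumptions for the example, not something BTRFS requires.

```python
from datetime import date, timedelta


def snapshots_to_delete(snapshot_dates, keep=7):
    """Return the snapshot dates that fall outside the newest `keep` entries."""
    ordered = sorted(snapshot_dates)
    surplus = max(len(ordered) - keep, 0)
    return ordered[:surplus]  # oldest first


def delete_command(snapshot_dir, day):
    """Build the delete command for one dated snapshot (hypothetical naming)."""
    return ["btrfs", "subvolume", "delete", f"{snapshot_dir}/@_{day.isoformat()}"]


if __name__ == "__main__":
    # Eight daily snapshots: one more than the retention limit.
    existing = [date(2024, 1, 1) + timedelta(days=i) for i in range(8)]
    for day in snapshots_to_delete(existing):
        print(" ".join(delete_command("/mnt/pool/snapshots", day)))
```

    Sorting by date rather than relying on directory order means a missed morning does not cause the wrong snapshot to be removed.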

    An incremental backup means I made a full backup some time ago, and then on Sunday BTRFS will let me compare last week's subvolume to this week's and send only the difference between the two subvolumes. This saves an incredible amount of time and system workload: a couple of seconds instead of several minutes. I use very fast NVMe drives, so if you're using SATA SSDs or platter drives the difference in time and workload would be even more significant. This results in a current backup - meaning the backup is current to last Sunday and is updated every week.
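
    For illustration, an incremental send uses the -p option of "btrfs send" to name the parent snapshot, which must already exist on both the source and the backup file system. The sketch below only builds the commands; the paths, snapshot names, and function name are hypothetical.

```python
import shlex


def incremental_send_commands(parent_snapshot, new_snapshot, backup_dir):
    """Build a send/receive pair that transmits only the changes since the parent."""
    # -p names the parent snapshot; only the difference to new_snapshot is sent.
    send = ["btrfs", "send", "-p", parent_snapshot, new_snapshot]
    receive = ["btrfs", "receive", backup_dir]
    return send, receive


if __name__ == "__main__":
    send, recv = incremental_send_commands(
        "/mnt/pool/snapshots/@_last_sunday",  # hypothetical: already on the backup too
        "/mnt/pool/snapshots/@_this_sunday",  # hypothetical: new read-only snapshot
        "/mnt/backup",
    )
    print(shlex.join(send) + " | " + shlex.join(recv))
```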

    Keeping historical backups:

    Recently, I was introduced to the idea of a "differential" or historical backup: a methodology where one stores older backups that are no longer updated. When I update the backup on Sunday, all the changes are transmitted to the backup - file moves, additions, AND deletions. There is no historical retention beyond this week.

    To create historical backups, the first thought might be to send a full backup copy instead of an incremental copy each Sunday. Thus, I would have a number of backups that would allow me to go back as far as I had the desire and storage for. But this would also mean a very large additional amount of time and workload to create a full backup every Sunday. Additionally, each historical backup would require the full amount of space of the subvolume. In my case, the root subvolume is 33GB, so each historical backup would consume approximately that much space. If I wanted 10 weeks' worth, that would be 330GB! Quite a bit.
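
    The space estimate is simple multiplication, shown here as a small Python sketch so you can plug in your own numbers. It assumes every full copy is about the same size as the subvolume; the function name is made up for the example.

```python
def full_copies_gb(subvolume_gb, copies):
    """Approximate space needed when every historical backup is a full copy."""
    return subvolume_gb * copies


if __name__ == "__main__":
    # 33GB root subvolume, 10 weekly full copies.
    print(full_copies_gb(33, 10))  # prints 330
```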

    A better idea:

    I can actually do a current backup and historical backups at the same time and with no additional time or workload and save a huge amount of drive space!

    I simply need to continue my current snapshot and incremental backup method and also take a snapshot of the current backup.

    My new process is:
    • Make a root subvolume snapshot every morning
    • Delete the oldest snapshot every morning, keeping 7 days' worth of snapshots
    • Make a snapshot of the root backup subvolume every Sunday
    • Then send an incremental backup every Sunday
    This saves a lot of time because taking an additional snapshot is virtually instantaneous and I don't have to do additional lengthy BTRFS send and receive operations. Obviously, I would need to limit the number of backups I keep and/or monitor the free space on the backup drive.
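
    One way the Sunday part of this routine might look is sketched below: first a read-only snapshot of the existing backup subvolume on the backup drive, then the usual incremental send. The post does not spell out the exact layout, so the paths, the "history" directory, the snapshot names, and the function name are all assumptions; the sketch only builds the commands.

```python
import shlex


def sunday_commands(backup_subvol, history_snapshot,
                    parent_snapshot, new_snapshot, backup_dir):
    """Build the Sunday steps in order: keep history, then send the increment."""
    # Step 1: freeze the current backup as a read-only historical snapshot.
    keep_history = ["btrfs", "subvolume", "snapshot", "-r",
                    backup_subvol, history_snapshot]
    # Step 2: send only the changes since the parent snapshot.
    send = ["btrfs", "send", "-p", parent_snapshot, new_snapshot]
    receive = ["btrfs", "receive", backup_dir]
    return keep_history, send, receive


if __name__ == "__main__":
    keep_history, send, recv = sunday_commands(
        "/mnt/backup/@_last_sunday",          # hypothetical backup subvolume
        "/mnt/backup/history/@_last_sunday",  # hypothetical historical snapshot
        "/mnt/pool/snapshots/@_last_sunday",  # parent, present on both sides
        "/mnt/pool/snapshots/@_this_sunday",  # new read-only snapshot
        "/mnt/backup",
    )
    print(shlex.join(keep_history))
    print(shlex.join(send) + " | " + shlex.join(recv))
```

    Because the historical snapshot lives on the same file system as the backup, it shares unchanged data with it and only grows as the backup diverges.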
