With the recent news that Crashplan were doing away with their inexpensive “Home” offering, I had reason to reconsider my choice of online backup provider. Since a description of how I achieve satisfactory safety margins, along with some discussion of the options available, seems to be of general interest, here's a version of my notes from the process. A more complete version (geared less toward discussion and more toward exposition) is also posted on my web site.
The canonical version of all of my data is stored on my home server, currently about 15 terabytes of raw disk (five individual disks) on btrfs RAID. This is backed up in two different places and has several other redundancies.
- RAID protects against the failure of any single disk. No data is lost from the live copy unless more than one disk fails at the same time.
- Using regular btrfs snapshots via Snapper, read-only copies of everything are available for the recent past. This allows easy recovery from accidental deletion or modification of files without requiring much additional storage.
- Remote (internet-based) backups are continuously made to Crashplan's cloud service.
- Offline backups are made to a USB hard disk (that I don't keep at home with the server) with btrfs send, allowing me to make incremental backups that don't require bidirectional communication between the backup source and destination. The offline disk is kept unplugged when not in use.
Between these, I'm protected from most plausible failure modes that could damage my data. For instance:
- Lightning strike destroys the server and disks: offline and online backups remain available.
- Ransomware encrypts everything on the server: offline backup cannot be affected, online backup keeps old versions of files.
- Linux bug causes silent filesystem corruption and data loss: may be possible to recover from read-only snapshots on the offline disk, and Crashplan should be unaffected.
- Probably others, more and less mundane
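For anyone curious what the incremental offline backup step can look like, here's a minimal sketch in Python that builds and optionally runs a btrfs send piped into btrfs receive. The function names, paths and snapshot numbers are made up for illustration and are not my actual layout. It assumes both snapshots are read-only and that the parent snapshot was already transferred to the offline disk on an earlier run.

```python
import shlex
import subprocess


def send_receive_commands(snapshot, dest, parent=None):
    """Build argv lists for `btrfs send` and `btrfs receive`.

    With a parent snapshot, send emits only the changes since that parent.
    """
    send = ["btrfs", "send"]
    if parent is not None:
        send += ["-p", parent]
    send.append(snapshot)
    receive = ["btrfs", "receive", dest]
    return send, receive


def run_backup(snapshot, dest, parent=None, dry_run=True):
    """Pipe send into receive; the data only ever flows one way."""
    send, receive = send_receive_commands(snapshot, dest, parent)
    if dry_run:
        # Return the equivalent shell pipeline instead of touching any disk.
        return f"{shlex.join(send)} | {shlex.join(receive)}"
    sender = subprocess.Popen(send, stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive, stdin=sender.stdout)
    sender.stdout.close()  # let the sender see a broken pipe if receive dies
    receiver_rc = receiver.wait()
    sender_rc = sender.wait()
    if sender_rc != 0 or receiver_rc != 0:
        raise RuntimeError(
            f"backup failed (send={sender_rc}, receive={receiver_rc})"
        )
    return None


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(run_backup(
        "/mnt/pool/.snapshots/42/snapshot",
        "/mnt/offline/backups",
        parent="/mnt/pool/.snapshots/41/snapshot",
    ))
```

The one-way pipe is the point: the destination never has to talk back to the source, so the same stream could just as well be written to a file and carried to the disk later.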
There is no shortage of services for storing data in "the cloud," though not all are suitable for my intended application of storing incremental backups. Though I've been using Crashplan for years now and have been generally happy with the service, I did consider some alternatives when Crashplan announced that their "home" service would no longer be available:
- AWS Glacier offers extremely low cost per gigabyte stored, but retrieving data from it is slow and expensive. It's good if you have large blobs that you want to store and retrieve as a unit (and do so very rarely), but a poor choice for incremental backups.
- Google Cloud Storage (particularly in the nearline and coldline flavors) has pretty low cost-per-gigabyte, competitive with Glacier in the coldline flavor but with much lower costs to access data stored in it.
These two are priced such that it's difficult to tell what you'd actually end up paying if trying to do incremental backups, because they charge for storage, operations (reads and writes) and data transfer out. While you could make them work for this kind of application, they're not really designed for it.
- C14 is similar to the above two pricing-wise, but it's not a commonly-used service. Data transfer and operations are free in the "intensive" flavor, which is convenient for estimation.
- B2 is basically designed for storing backups and happens to be possible to use for other applications. Pricing ends up being reasonable, with flat charges per gigabyte stored and downloaded.
- Tarsnap is a service targeted at savvy users. It doesn't support Windows, though, and is significantly more expensive than most of the other options here.
- Backblaze is basically "B2 but using only their client." Flat subscription fee for "unlimited" storage, but their client is Windows/Mac only.
- Crashplan used to be the same price as Backblaze for the same "unlimited" storage, but their "Small Business" offering, which will soon be the only available option, is twice the price. The client runs on all three major operating systems, though.
Services like Google Drive or Dropbox may be useful to some users, but they're not designed for this kind of use case so I did not seriously consider any of them.
Despite the cost increase, I've been happy enough sticking with Crashplan: it's still cheaper than any of the per-gigabyte services given the size of my backups.
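To make the flat-fee versus per-gigabyte comparison concrete, here's a rough sketch of the arithmetic in Python. The prices in it are placeholders I picked for round numbers, not any provider's actual rates, and the function names are hypothetical. Plug in current prices before drawing any conclusions.

```python
def metered_monthly_cost(stored_gb, storage_per_gb,
                         egress_gb=0.0, egress_per_gb=0.0,
                         operations=0, cost_per_operation=0.0):
    """Monthly bill for a service charging for storage, transfer out and operations."""
    return (stored_gb * storage_per_gb
            + egress_gb * egress_per_gb
            + operations * cost_per_operation)


def break_even_gb(flat_monthly_fee, storage_per_gb):
    """Stored size above which a flat fee beats storage-only metered pricing."""
    return flat_monthly_fee / storage_per_gb


if __name__ == "__main__":
    # Placeholder prices, NOT real quotes from any provider.
    flat_fee = 10.0          # per month, "unlimited" storage
    storage_price = 0.005    # per gigabyte per month
    backup_size_gb = 5000.0  # hypothetical backup size

    metered = metered_monthly_cost(backup_size_gb, storage_price)
    print(f"metered: {metered:.2f}/month, flat: {flat_fee:.2f}/month")
    print(f"break-even at about {break_even_gb(flat_fee, storage_price):.0f} GB")
```

Egress and operation charges only push the metered bill higher, so the storage-only break-even is an optimistic estimate for the per-gigabyte services.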
I do have computers other than my home server that I want to back up, but that's relatively easy. For my desktop machine at home, I mostly just mount the server as a networked disk and operate directly on things stored on it. While I've previously used Crashplan's peer-to-peer functionality for laptops that aren't always on my home network, I've started using Duplicati to back up to my server via SSH instead (which works fine in conjunction with dynamic DNS so my server is reachable externally). My other servers back up to the home one with Borg, in basically the same way that other machines are using Duplicati.
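As an illustration of the Borg-over-SSH arrangement, here's a small Python sketch that assembles a borg create command. It assumes Borg 1.x command-line syntax (Borg 2 changed how the repository is specified), and the host name, user, repository path and backed-up directories are all invented for the example.

```python
import shlex


def borg_create_command(repo, paths, archive="{hostname}-{now}"):
    """Build a `borg create` argv for a repository reached over SSH.

    The {hostname} and {now} placeholders are expanded by Borg itself,
    not by Python, so they are passed through literally.
    """
    return ["borg", "create", "--stats", f"{repo}::{archive}", *paths]


if __name__ == "__main__":
    # Hypothetical dynamic-DNS host name and paths, for illustration only.
    command = borg_create_command(
        "ssh://backup@home.example.org/srv/borg/repo",
        ["/etc", "/home"],
    )
    print(shlex.join(command))
```

The printed line can be dropped into a cron job or systemd timer on each remote machine. The repository has to be created once beforehand with borg init.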
So, backups: do you have them? If no, why not? Discuss and/or ask your questions.