100ms 7 hours ago

Managing backup exclusions strikes again. It's impossible. Either commit to backing up the full disk, including the 80% of data that is easily regenerated or redownloaded, or risk the 0.001% critical 16 byte file that turns out to contain your Bitcoin wallet key or god knows what else. I've been bitten by this more times than I'd like to admit managing my own backups, so it's hard to expect a shrink-wrapped provider to do much better. It only takes one dumb simplification like "my Downloads folder is junk, no need to back that up" combined with (no doubt, years later) downloading say a 1Password recovery PDF that you lazily decide will live in that folder, and the stage is set.

Pinning this squarely on user error. Backblaze could clearly have done better, but it's such a well-known failure mode that it's not far off refusing to test restores of a bunch of tapes left in the sun for a decade.

dspillett 7 hours ago | parent | next [-]

> Pinning this squarely on user error.

It isn't user error if it was working perfectly fine until the provider made a silent change.

Unless the user error you are referring to is not managing their own backups, like I do. Though this isn't free from trouble: I once had silent failures backing up a small section of my stuff for a while, because of an ownership/perms snafu and my script not sending the stderr reports anywhere I'd generally see them. Luckily an automated test (every now & then it scans for differences between the whole backup and current data) caught it, because it could see the source and noticed a copy wasn't in the latest snapshot on the far-away copy. Reliable backups are a harder problem than most imagine.
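
For anyone wanting a similar safety net, here is a rough sketch of that kind of check. It is not the actual script described above; `audit_backup` and its argument names are made up for illustration, and it assumes the latest snapshot is mounted as a plain directory tree. It only checks that every source file has a counterpart in the snapshot (it does not compare contents), and it records directories it could not list instead of silently skipping them, which is the failure a perms snafu produces.

```python
import os


def audit_backup(source_root, snapshot_root):
    """Compare a live tree against a snapshot of it (hypothetical helper).

    Returns (missing, unreadable): relative file paths that exist in the
    source but not in the snapshot, and directories os.walk could not list.
    """
    missing, unreadable = [], []

    def on_error(err):
        # os.walk ignores listing errors by default; record them instead.
        unreadable.append(str(err.filename))

    for dirpath, _dirnames, filenames in os.walk(source_root, onerror=on_error):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), source_root)
            # lexists: a symlink in the snapshot counts even if it dangles.
            if not os.path.lexists(os.path.join(snapshot_root, rel)):
                missing.append(rel)
    return sorted(missing), sorted(unreadable)
```

Run it from cron against the far-away copy and have it mail or otherwise push both lists somewhere you actually look; a non-empty `unreadable` list deserves as much attention as a non-empty `missing` one.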

mr_mitm 7 hours ago | parent | prev | next [-]

If there is a footgun I haven't considered yet in backup exclusions, I'd like to know more. Shouldn't it be safe to exclude $XDG_CACHE_HOME? Unfortunately, since many applications don't bother with the XDG standard, I have to exclude a few more directories, so if you have any stories about unexpected exclusions, would you mind sharing?
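
To make the question concrete, this is roughly how such an exclusion could be resolved. It is a sketch only: `xdg_cache_home` and `is_excluded` are hypothetical helper names, not part of any backup tool. It follows the XDG Base Directory rules that an unset or empty `$XDG_CACHE_HOME` means `$HOME/.cache` and that a relative value is ignored.

```python
import os
from pathlib import Path


def xdg_cache_home(env=None, home=None):
    """Resolve the XDG cache directory (hypothetical helper)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    value = env.get("XDG_CACHE_HOME", "")
    # Unset, empty, or relative values fall back to ~/.cache.
    if value and os.path.isabs(value):
        return Path(value)
    return home / ".cache"


def is_excluded(path, exclusions):
    """True if path is one of the excluded directories or lies inside one."""
    path = Path(path)
    return any(path == ex or ex in path.parents for ex in exclusions)
```

For example, `is_excluded("/home/u/.cache/pip/x", [xdg_cache_home()])` is true when the cache directory resolves to `/home/u/.cache`, while a sibling such as `/home/u/.cachefoo` is not matched because the comparison is per path component rather than by string prefix.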

100ms 7 hours ago | parent [-]

I don't bulk exclude .cache, though I no longer remember exactly why I started doing it that way. I have a script that strips down larger known caches as part of the backup. But the logic, whatever it was, is easy to reconstruct: with a bulk exclusion you're relying on apps to correctly categorise what is vs. isn't cache.

Also consider e.g. ~/.cache/thumbnails. It's easy to understand as a cache, but if the thumbnails were of photos on an SD card that gets lost or immediately dies, is it still a cache? It might be the only copy of some once-in-a-lifetime event or holiday where the card didn't make it back with you. Something like this actually happened to me, but in that case, the "cache" was a tarball of an old photo gallery generated from the originals that ought to have been deleted.

It's just really hard to know upfront whether something is actually important or not. Same for the Downloads folder. Vendor goes bankrupt, removes old software versions, etc. The only safe thing you can really do is hold your nose and save the whole lot.
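
If you do still want to trim, the "strip only known caches" idea from earlier in this comment could look something like the sketch below. This is not my actual script; `split_cache`, `tree_size` and the contents of `known_caches` are placeholders you would fill in yourself. Anything not explicitly named stays in the backup, so a new app that misuses .cache gets backed up by default.

```python
import os


def tree_size(path):
    """Total size in bytes of regular files under path (symlinks not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it for sizing
    return total


def split_cache(cache_root, known_caches):
    """Partition the top-level entries of cache_root into (strip, keep).

    Each list holds (name, size_in_bytes) tuples. Only names listed in
    known_caches are candidates for stripping; everything else is kept.
    """
    strip, keep = [], []
    for entry in sorted(os.listdir(cache_root)):
        full = os.path.join(cache_root, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            size = tree_size(full)
        else:
            size = os.lstat(full).st_size
        (strip if entry in known_caches else keep).append((entry, size))
    return strip, keep
```

Printing the `keep` list with sizes every so often is a cheap way to spot something large that might deserve a deliberate decision, rather than letting a blanket rule make it for you.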
