| |
| ▲ | jtbayly 4 hours ago | parent | next [-] | | Reading your comments, it sounds like you are arguing it is impossible to backup files in Dropbox in any reasonable way, and therefore nobody should backup their cloud files. I know you haven’t technically said that, but that’s what it sounds like. I assume you don’t think that, so I’m curious, what would you propose positively? | | |
| ▲ | bayindirh 3 hours ago | parent [-] | | > I know you haven’t technically said that, but that’s what it sounds like. Yes, I didn't technically say that. > It sounds like you are arguing it is impossible to backup files in Dropbox in any reasonable way, and therefore nobody should backup their cloud files. I'm not arguing either of those. What I said is that with "on demand file download", traditional backup software faces a hard problem. However, there are better ways to do it, the primary candidate being rclone. You can register a new application ID for your rclone installation with your Google Drive and Dropbox accounts, and use rclone as a very efficient, rsync-like tool to back up your cloud storage. That's what I do. I'm currently backing up my cloud storage to a local TrueNAS installation. rclone automatically hash-checks everything and downloads only the files that changed. If you can mount Backblaze via FUSE or something similar, you can use rclone as an intelligent MITM agent that pulls from the cloud and pushes to Backblaze. Also, using restic or Borg as a backup container is a good idea, since they can deduplicate and/or store only the differences between snapshots, saving tons of space in the process, plus they encrypt things for good measure. | | |
| ▲ | nine_k 2 hours ago | parent [-] | | This. You should not try to back up your local cache of cloud files as if those were your local files. Use a tool that talks to the cloud storage directly. Use tools with straightforward, predictable semantics, like rclone, or synching, or restic/Borg. (Deduplication rules, too.) |
|
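A minimal sketch of the rclone approach described in this subthread, assuming rclone is installed and a remote has already been configured. The remote name `dropbox:` and the destination path are hypothetical placeholders, not anything the commenters specified. `rclone sync` and the `--checksum` and `--dry-run` flags are real rclone options; `--checksum` makes rclone compare size and hash instead of size and modification time, where the backends share a hash type.

```python
import shutil
import subprocess


def build_sync_command(remote, dest, dry_run=False):
    """Build an rclone command that mirrors `remote` into `dest`.

    `remote` is an rclone remote such as "dropbox:" (hypothetical name,
    it must match whatever was set up with `rclone config`).
    """
    cmd = ["rclone", "sync", remote, dest, "--checksum"]
    if dry_run:
        # Report what would be transferred or deleted without doing it.
        cmd.append("--dry-run")
    return cmd


def run_backup(remote, dest, dry_run=True):
    """Run the sync; raises if rclone is missing or exits non-zero."""
    if shutil.which("rclone") is None:
        raise RuntimeError("rclone is not installed or not on PATH")
    subprocess.run(build_sync_command(remote, dest, dry_run), check=True)


# Example (not executed here, paths are placeholders):
# run_backup("dropbox:", "/mnt/backup/dropbox", dry_run=True)
```

Because `sync` makes the destination match the source, including deletions, a dry run first is a cautious default. Pointing restic or Borg at the destination directory afterwards would add the snapshotting and deduplication mentioned above.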
| |
| ▲ | vladvasiliu 4 hours ago | parent | prev | next [-] | | But if the files are only on the remote storage and not local, chances are they haven't been modified recently, so it shouldn't download them fully, just check the metadata cache for size / modification time and let them be if they didn't change. So, in practice, you shouldn't have to download the whole remote drive when you do an incremental backup. | | |
| ▲ | bayindirh 4 hours ago | parent [-] | | You can't trust size and modification time all the time. Modification time is the better indicator, but it's not foolproof. The only reliable way is checksumming. Interestingly, rclone supports that on many providers, but for Backblaze to support it, it would need to integrate rclone, connect to the providers via that channel, and request checksums, which is messy, complicated, and computationally expensive, even assuming you don't hit API rate limits on the cloud provider. | | |
| ▲ | NetMageSCW 3 hours ago | parent [-] | | If you can’t trust modification time you are doing something so unusual that you probably need to be handling your backups privately anyway. | | |
| ▲ | bayindirh 3 hours ago | parent [-] | | I don't think so. Sometimes the modification time of a file that is not downloaded on computer A, but was modified by computer B, is not reflected immediately on computer A. Hence, backup software running on computer A will think that the file has not been modified. This is a known problem in file synchronization. Also, some applications that modify files revert or preserve the file's mtime for their own reasons. They are rare, but they're there. |
|
|
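The disagreement in this subthread (size and modification time versus checksums) can be illustrated with a small local sketch. It simulates the rare case mentioned above, an application that rewrites a file and then restores its mtime, and shows that a size/mtime comparison misses the change while a content hash catches it. The function names are made up for illustration, and SHA-256 is an arbitrary choice; cloud providers expose their own hash types.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def metadata_signature(path):
    """Cheap change detector: (size, mtime in nanoseconds)."""
    st = path.stat()
    return (st.st_size, st.st_mtime_ns)


def content_digest(path, chunk_size=1 << 20):
    """Expensive change detector: SHA-256 over the full file contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def demo():
    """Return (metadata_changed, hash_changed) after a 'stealth' edit."""
    with tempfile.TemporaryDirectory() as tmp:
        p = Path(tmp) / "report.txt"
        p.write_bytes(b"version 1")
        old_meta = metadata_signature(p)
        old_hash = content_digest(p)
        st = p.stat()

        # Overwrite with same-length content, then restore the timestamps,
        # as an mtime-preserving application would.
        p.write_bytes(b"version 2")
        os.utime(p, ns=(st.st_atime_ns, st.st_mtime_ns))

        return (metadata_signature(p) != old_meta,
                content_digest(p) != old_hash)
```

The trade-off is the one raised in the comments: hashing requires reading every byte, which for on-demand cloud files means downloading them unless the provider can supply a checksum through its API.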
| |
| ▲ | Chaosvex 3 hours ago | parent | prev [-] | | Then do it in memory, assuming those services allow you to read the files like that. It sounds like they do based on your other comments. | | |
| ▲ | bayindirh 3 hours ago | parent [-] | | The problem is that downloading files and disk management are not under your control; that part is handled transparently by the cloud client (Dropbox, Google Drive, et al.). The application accessing the file just waits, akin to waiting for a disk to spin up. The filesystem is a black box for such software, since it doesn't know where a file resides. If you want control, you need to talk to every party, including the cloud provider, à la rclone. |
|
|