nostrademons 7 hours ago

I assume it's something like this:

The attacking website periodically makes random reads from a large file in localStorage. Other open tabs and websites have JavaScript running that periodically performs operations that result in SSD traffic. For example, Gmail polls for new mail at a certain interval, and each request results in a cache write that keeps the SSD busy and delays other conflicting I/O operations. Reddit checks for new chat messages. Large, memory-heavy websites get paged out of RAM.

The pattern of I/O operations that a website makes creates a fingerprint of interference with the I/O ops that the attacking website is doing, showing up as varying latency whenever the SSD is busy. This fingerprint can then be mapped back to a specific website by training a CNN on it: basically, a neural net classifies the pattern of delays that other websites' I/O ops impose on the attacker's reads.
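To make that concrete, here's a rough toy sketch in Python. It is a simulation, not the actual attack: the latencies are made up rather than measured, the site profiles and their polling periods are hypothetical, and I've swapped the CNN for a plain autocorrelation check just to show how a periodic background writer leaves a recoverable pattern in the attacker's read latencies.

```python
import random

# Hypothetical profiles: how many probe reads pass between background writes.
KNOWN_PROFILES = {"mail-poller": 7, "chat-poller": 13}


def simulate_trace(period, n_samples=200, seed=0):
    """Simulated latency (arbitrary units) of the attacker's probe reads.

    Every `period`-th probe collides with a background write and is slower.
    All constants here are invented for illustration.
    """
    rng = random.Random(seed)
    trace = []
    for i in range(n_samples):
        latency = 1.0 + rng.uniform(0.0, 0.2)  # baseline read plus jitter
        if period and i % period == 0:
            latency += 5.0  # the device was busy with someone else's write
        trace.append(latency)
    return trace


def dominant_period(trace, max_period=50):
    """Return the lag (>= 2) with the highest autocorrelation."""
    mean = sum(trace) / len(trace)
    centered = [x - mean for x in trace]
    best_lag, best_score = 0, float("-inf")
    for lag in range(2, max_period + 1):
        score = sum(
            centered[i] * centered[i + lag] for i in range(len(centered) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def classify(trace):
    """Pick the known profile whose period is closest to the observed one."""
    lag = dominant_period(trace)
    return min(KNOWN_PROFILES, key=lambda name: abs(KNOWN_PROFILES[name] - lag))


if __name__ == "__main__":
    print(classify(simulate_trace(7)))
    print(classify(simulate_trace(13)))
```

A real classifier would need far more than one period per site, which is presumably why a neural net gets used instead of something this simple.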

In theory it makes sense, but it seems very noisy. Anything that makes absolutely zero requests or I/O operations in the background (like, say, HN, or most old-school text sites) wouldn't show up, and would be indistinguishable from any other zero-request site. And having other sources of I/O on the same computer (say you're running an Ethereum client that's perpetually updating the blockchain, or you're downloading a bunch of torrents, or you've got Dropbox syncing your directory) would introduce noise that throws off the classifier.
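Here's a small simulated illustration of the noise problem. Everything in it is an assumption for the sake of the example: the latency numbers, the `noise_rate` knob standing in for unrelated disk activity, and the use of normalized autocorrelation as a stand-in for "how much signal the classifier has to work with".

```python
import random


def probe_trace(period, noise_rate, n=400, seed=1):
    """Simulated probe-read latencies (arbitrary units, invented constants).

    period:     a background tab causes a slow read every `period` probes
                (0 means the tab does no background I/O at all).
    noise_rate: probability that any given probe is also delayed by
                unrelated disk activity (torrents, file sync, and so on).
    """
    rng = random.Random(seed)
    trace = []
    for i in range(n):
        latency = 1.0 + rng.uniform(0.0, 0.2)
        if period and i % period == 0:
            latency += 5.0
        if rng.random() < noise_rate:
            latency += rng.uniform(0.0, 10.0)
        trace.append(latency)
    return trace


def periodicity_strength(trace, lag):
    """Normalized autocorrelation at `lag`, roughly in [-1, 1]."""
    mean = sum(trace) / len(trace)
    centered = [x - mean for x in trace]
    variance = sum(x * x for x in centered)
    if variance == 0:
        return 0.0
    overlap = sum(
        centered[i] * centered[i + lag] for i in range(len(centered) - lag)
    )
    return overlap / variance


if __name__ == "__main__":
    clean = periodicity_strength(probe_trace(10, noise_rate=0.0), lag=10)
    noisy = periodicity_strength(probe_trace(10, noise_rate=0.5), lag=10)
    idle = periodicity_strength(probe_trace(0, noise_rate=0.0), lag=10)
    print(f"clean={clean:.2f} noisy={noisy:.2f} idle={idle:.2f}")
```

In this toy setup the periodic signal is strong on a quiet machine, drops sharply once half the probes are hit by unrelated I/O, and a tab that does nothing in the background shows essentially no periodic signal at all.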

doodlebugging 4 hours ago

That's interesting. Thanks for the explanation. If I read this right, this isn't as effective against spinning-HDD systems, and it depends on the user keeping more than one tab open as they browse?

If that's the case, then my system, which is still HDD-based, is not threatened. And since I tend to close tabs and windows and just spin up a new private window for each site, clearing cookies etc. on exit, maybe this is a non-issue for me. Or maybe I should just block JavaScript too.

nostrademons 40 minutes ago

It'd have some effectiveness against spinning HDDs because it's really just measuring contention for the I/O subsystem, but it'd likely be less because the kernel usually buffers writes to HDDs internally. But then, the kernel also usually buffers writes to SSDs, just with lower latency between the call and the data being written.

I don't think too highly of this particular threat vector. It seems like the kind of attack where you could perhaps get a working proof of concept going in the lab to write a paper and demonstrate some results, but actually using it to attack people at scale seems prohibitively noisy. People who close all their tabs when not in use are not at risk (and the data I had was that most people don't actually use browser tabs; they're very much a power-user feature). People who run other disk-intensive processes like BitTorrent or various file-syncing services aren't really at risk, because those processes inject similar noise into the data stream. The signal in general seems weak because of buffering, differing SSD latency, and so on.

puppycodes 7 hours ago

That's a good explanation. It does seem extremely noisy and not at all practical for fingerprinting a user compared to other methods. If you have JavaScript enabled, assume you can be fingerprinted.