| ▲ | lxgr 16 hours ago |
| Wrong layer. SSDs know which blocks have been written to a lot, which ones have produced read errors before, etc., and often even have heterogeneous storage (such as a bit of SLC for burst writes next to a bunch of MLC for density). With that information they can spend ECC bits much more efficiently than a file system ever could, since the file system usually sees the storage as a flat, linear array of blocks. |
|
| ▲ | adrian_b 15 hours ago | parent | next [-] |
| This is true, but you nevertheless cannot place your trust only in the manufacturer of the SSD/HDD, as I have seen enough cases where the SSD/HDD reports no errors but still returns corrupted data. For any important data you should keep your own file hashes for corruption detection, and you should add some form of redundancy for file repair, either with a specialized tool or simply by duplicating the file on separate storage media. A database of file hashes can also serve purposes other than corruption detection, e.g. it can be used to find duplicate data without physically accessing the archival storage media. |
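To make that last suggestion concrete, here is a minimal sketch of such a hash database in Python. The database path, table layout, and function names are illustrative assumptions of mine, not anything the comment prescribes; in practice a tool like hashdeep or a checksumming file system can fill the same role.

    # Sketch only: walk a tree, record SHA-256 per file, and re-verify later.
    # "hashes.db" and the table layout are made-up names for illustration.
    import hashlib
    import os
    import sqlite3

    def file_sha256(path, chunk_size=1 << 20):
        """Hash a file in chunks so large archives don't have to fit in RAM."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    def index_tree(root, db_path="hashes.db"):
        """Record a hash for every file under `root`."""
        con = sqlite3.connect(db_path)
        con.execute("CREATE TABLE IF NOT EXISTS hashes (path TEXT PRIMARY KEY, sha256 TEXT)")
        for dirpath, _, names in os.walk(root):
            for name in names:
                path = os.path.join(dirpath, name)
                con.execute("INSERT OR REPLACE INTO hashes VALUES (?, ?)",
                            (path, file_sha256(path)))
        con.commit()
        con.close()

    def verify_tree(db_path="hashes.db"):
        """Re-hash every recorded file and report silent corruption."""
        con = sqlite3.connect(db_path)
        for path, expected in con.execute("SELECT path, sha256 FROM hashes"):
            if file_sha256(path) != expected:
                print("CORRUPTED:", path)
        con.close()

The same table answers the duplicate-finding question with a single query (GROUP BY sha256 HAVING COUNT(*) > 1) without touching the archive media again; what it cannot do is repair anything, which is where the separate redundancy mentioned above comes in.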
| |
| ▲ | lxgr 15 hours ago | parent [-] |
| Verifying at higher layers can be ok (it's still not ideal!), but trying to actively fix things that are broken at the layers below usually becomes a nightmare very quickly. |
|
|
| ▲ | DeepSeaTortoise 14 hours ago | parent | prev [-] |
| IMO it's exactly the right layer, just like for ECC memory. There's a lot of potential for errors when the storage controller processes the data and turns it into analog magic for transmission. In practice this is a solved problem, but only until someone makes a mistake; then there will be a lot of trouble debugging it, between the manufacturer predictably denying their mistake and people getting hung up on the usual suspects. Doing all the ECC work right on the CPU gives you the benefits against bitrot plus resilience against all transmission errors for free. And if things go just right, we might even get better instruction support for ECC. That'd be a nice bonus. |
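For a feel of what the simplest possible CPU-side redundancy looks like, here is a toy single-parity scheme in Python: one XOR parity block that can rebuild any one data block once something else (e.g. a hash mismatch, as in the sibling comment) has identified which block is bad. The block contents are made up, and real tools (par2, RAID-6, ZFS) use stronger Reed-Solomon-style codes; this is only a sketch of the idea.

    # Toy erasure coding on the CPU: one XOR parity block over equally sized
    # data blocks can reconstruct any single block known to be bad.
    def xor_blocks(blocks):
        """XOR equally sized byte blocks together."""
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                out[i] ^= byte
        return bytes(out)

    data = [b"first data block".ljust(16, b"\0"),
            b"second block".ljust(16, b"\0"),
            b"third block".ljust(16, b"\0")]
    parity = xor_blocks(data)                      # stored alongside the data

    # Later: block 1 failed its hash check; rebuild it from parity + the rest.
    rebuilt = xor_blocks([parity, data[0], data[2]])
    assert rebuilt == data[1]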
| |
| ▲ | lxgr 14 hours ago | parent | next [-] |
| > There's a lot of potential for errors when the storage controller processes and turns the data into analog magic to transmit it. |
| That's a physical layer, and as such it should obviously have end-to-end ECC appropriate to the task. But the shape of the error distribution there is probably very different from that of NAND data at rest, which is different again from DRAM or PCIe. |
| For the same reason, IP does not do error correction, but rather relies on lower layers to present error-free datagram semantics to it: Ethernet, Wi-Fi, and (managed-spectrum) 5G all have dramatically different properties that higher layers have no business worrying about. And sticking with that example, once it becomes TCP's job to handle packet loss due to transmission errors (instead of just congestion), things go south pretty quickly. |
| ▲ | johncolanduoni 11 hours ago | parent [-] |
| > And sticking with that example, once it becomes TCP's job to handle packet loss due to transmission errors (instead of just congestion), things go south pretty quickly. |
| Outside of wireless links (where some degree of FEC is necessary regardless), this is mostly because TCP's checksum is so weak. QUIC, for example, handles this much better, since the packet's authenticated encryption doubles as a robust error-detecting code. And unlike TLS over TCP, the connection is resilient to these failures: a TCP packet that is corrupted but still passes the TCP checksum will kill the TLS connection on top of it instead of being retransmitted. |
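Two small Python illustrations of those claims; the checksum routine and the ChaCha20-Poly1305 example (one of the AEADs QUIC actually negotiates via TLS 1.3) are my own sketches, and the second part needs the third-party `cryptography` package.

    # 1) stdlib only: TCP's 16-bit Internet checksum is a ones'-complement sum,
    #    so it cannot even distinguish reordered 16-bit words.
    import struct

    def internet_checksum(data: bytes) -> int:
        if len(data) % 2:
            data += b"\x00"
        total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
        while total >> 16:                      # fold carries back into 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    p1 = b"\xde\xad\xbe\xef\x00\x01"
    p2 = b"\xbe\xef\xde\xad\x00\x01"            # different payload, same checksum
    assert p1 != p2 and internet_checksum(p1) == internet_checksum(p2)

    # 2) needs `pip install cryptography`: an AEAD rejects even one flipped bit,
    #    so corruption surfaces as a failed tag instead of silently passing.
    import os
    from cryptography.exceptions import InvalidTag
    from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

    key = ChaCha20Poly1305.generate_key()
    nonce = os.urandom(12)
    aead = ChaCha20Poly1305(key)
    packet = bytearray(aead.encrypt(nonce, b"a QUIC packet payload", None))
    packet[5] ^= 0x01                           # simulate one bit of line noise
    try:
        aead.decrypt(nonce, bytes(packet), None)
    except InvalidTag:
        print("corruption caught by the authentication tag")

The difference described above follows directly: with TLS over TCP, a corrupted segment that slips past the weak checksum reaches TLS, the record authentication fails, and the connection dies; with QUIC the packet that fails AEAD verification is simply dropped and its data retransmitted.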
| ▲ | lxgr 11 hours ago | parent [-] |
| Ah, I meant "go south" in terms of performance, not correctness. Most TCP congestion control algorithms interpret loss exclusively as a congestion signal, since that's what most lower layers have historically presented to them. This is why newer TCP variants that use different congestion signals can deal better with networks that violate that assumption, such as Starlink: https://blog.apnic.net/2024/05/17/a-transport-protocols-view... |
| Other than that, I didn't realize that TLS has no way of just retransmitting broken data without breaking the entire connection (and a potentially expensive request or response with it)! Makes sense at that layer, but I never thought about it in detail. Good to know, thank you. |
|
| |
| ▲ | johncolanduoni 12 hours ago | parent | prev [-] |
| ECC memory modules don't do their own very complicated remapping from linear addresses to physical blocks like SSDs do. ECC memory is also oriented toward fixing transient errors, not persistently bad physical blocks. |
|