aforwardslash a day ago

ECC memory is traditionally slower and considerably more complex, and it doesn't completely eliminate the problem (most memories correct 1 bit per word and detect 2 bits per word). It makes sense when environmental factors such as flaky power, temperature or RF interference can be easily ruled out, such as in a server room. But yeah, I agree with you, as ECC solves like 99% of the cases.
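
To illustrate the "correct 1 bit per word and detect 2 bits per word" behaviour, here is a minimal sketch of an extended Hamming (SECDED) code. It is a toy on an 8-bit word, not how any particular memory controller does it; the function names and the bit layout (index 0 holds the overall parity bit) are assumptions made for the example.

```python
def secded_encode(data):
    """Encode a list of 0/1 data bits as an extended Hamming codeword."""
    m = len(data)
    r = 0
    while (1 << r) < m + r + 1:      # smallest r with 2^r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)             # index 0: overall parity, 1..n: Hamming code
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two, so a data position
            code[pos] = next(bits)
    for i in range(r):
        p = 1 << i                   # parity bit p covers positions with bit p set
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]
        code[p] = parity
    code[0] = sum(code) % 2          # make the total number of 1s even
    return code


def secded_decode(code):
    """Return (status, data); data is None when the error is uncorrectable."""
    code = list(code)
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= n:
        code[syndrome] ^= 1          # odd parity: treat as one flip and fix it
        status = "corrected"
    else:
        return "uncorrectable", None # even parity but nonzero syndrome: two flips
    data = [code[pos] for pos in range(1, n + 1) if pos & (pos - 1)]
    return status, data


word = [1, 0, 1, 1, 0, 0, 1, 0]
stored = secded_encode(word)

one_flip = list(stored)
one_flip[5] ^= 1
two_flips = list(one_flip)
two_flips[9] ^= 1

print(secded_decode(stored)[0])      # ok
print(secded_decode(one_flip)[0])    # corrected
print(secded_decode(two_flips)[0])   # uncorrectable
```

Three or more flips in one word can be miscorrected or missed by a code like this, which is the sense in which it does not completely eliminate the problem.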

indolering a day ago | parent | next [-]

Being able to detect these issues is just as important as preventing them.

aforwardslash a day ago | parent [-]

Thing is, every reported bug can be a bit flip. In some cases you can actually have a successful execution, but with bit flips in the instrumentation reporting errors that don't exist.

russdill 16 hours ago | parent | prev | next [-]

The amount of overhead a few bits of ECC has is basically a rounding error, and even then, the only time the hardware is really doing extra work is when bit errors occur and correction has to happen.

The main overhead is simply the extra RAM required to store the extra bits of ECC.

jeffbee a day ago | parent | prev [-]

ECC DIMMs are "slower" because they are bought by smart people who expect their memory to load the stored value, rather than children who demand racing stripes on the DIMMs.

matja 16 hours ago | parent | next [-]

The actual RAM chips on an ECC DIMM are exactly the same as on a non-ECC DIMM; there are just an extra 1/2/4 chips to extend the word to 72 bits.
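
As a rough check on where 72 comes from, assuming a 64-bit data word protected by a Hamming code plus one overall parity bit (the function name here is made up for the sketch):

```python
def secded_check_bits(data_bits):
    """Check bits needed for single-error-correct, double-error-detect."""
    r = 0
    while (1 << r) < data_bits + r + 1:   # Hamming bound for single-bit correction
        r += 1
    return r + 1                          # one extra overall parity bit for detection


for m in (8, 16, 32, 64):
    c = secded_check_bits(m)
    print(f"{m} data bits + {c} check bits = {m + c} total, overhead {c / m:.1%}")
```

For 64 data bits this gives 8 check bits, so 72 bits per word and 12.5% extra storage. The relative overhead shrinks as the word gets wider.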

The main reason ECC RAM is slower is that it's not (by default) overclocked to the edge of stability; the JEDEC standard speeds are used.

The other much smaller factors are:

* The tREFI parameter (refresh interval) is usually set so that ECC RAM refreshes at double the frequency, so that it handles high-temperature operation.
* The register chip buffers the command/address/control/clock signals, adding a clock of latency to every command (<1ns, much smaller than the typical memory latency you'd measure from the memory controller).
* ECC calculation (AMD states 2 UMC cycles, <1ns).
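
For a sense of scale on the "one clock of latency" point, here is a small back-of-the-envelope sketch. The speed grades are picked only as examples, and the helper name is made up.

```python
def clock_period_ns(transfer_rate_mt_s):
    """Length of one memory clock in nanoseconds for a DDR speed grade."""
    clock_mhz = transfer_rate_mt_s / 2    # DDR moves data twice per clock
    return 1000.0 / clock_mhz


for rate in (2400, 3200, 4800):           # example speed grades in MT/s
    print(f"{rate} MT/s: one clock = {clock_period_ns(rate):.3f} ns")
```

At each of these speeds a single clock comes out under 1 ns.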

Dylan16807 19 hours ago | parent | prev | next [-]

ECC keeps your bits safe from random flips by a ridiculously large factor. You can run the memory at high consumer speeds, giving up some of that safety margin, while still being more reliable than everything else in your computer.

And there are non-random bit errors that can hit you at any speed, so it's not like going slow guarantees safety.

undersuit a day ago | parent | prev | next [-]

ECC is actually slower. The hardware that checks every transaction is correct does add a slight delay, but it's nothing compared to the delay of working on corrupted data.

throwaway85825 a day ago | parent | prev [-]

There's just no demand for high-speed ECC aside from a few people making their own DIMMs.