Agingcoder 11 hours ago
Well, my admins eventually believed me, so I'm fairly comfortable with what I said. We also had a few thousand physical servers with about a terabyte of RAM each. You are right: we did see repaired errors, but we also saw (indirectly, and after testing) unrepaired ones.
RealityVoid 9 hours ago
OK, I am sure there is _some_ amount of unrepairable errors. But the initial discussion was that ECC RAM makes the problem go away, and your point was that it doesn't. The vast majority of the errors, according to my understanding and to the paper you pointed to, are repairable. About 1 in 400 errors is non-repairable. That's a huge improvement! If you had ECC RAM, the failures Firefox sees here would drop from 10% to 0.025%! That is highly significant!

Even better: you would now be informed of 2-bit errors. You would _know_ what is wrong. You could have 3(!)-bit errors, and those you might not see, but they'd be several orders of magnitude rarer still. So yes, it would not go away 100%, but 99.9% of it would. That's... making it go away, in my book.

And last but not least, this paper mentions uncorrectable errors. It says nothing about undetectable ECC errors! You said _undetectable_ errors. I'm sure they happen, but I would be surprised if you saw any meaningful incidence of them, even at terabytes of data. It's probably on the order of 0.000625 of the errors you can get (but if you want, I can do more solid math).
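For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate above. It assumes the two rough figures from this comment (10% of failures caused by memory errors, about 1 in 400 errors not repairable by ECC); the variable and function names are made up for illustration, and none of this is measured data.

```python
from fractions import Fraction

# Rough figures quoted in the comment above; estimates, not measurements.
memory_error_share = Fraction(10, 100)  # share of failures attributed to memory errors
uncorrectable_ratio = Fraction(1, 400)  # about 1 in 400 errors is not repairable by ECC


def residual_failure_share(error_share, uncorrectable):
    """Share of failures left over if ECC fixes every repairable error."""
    return error_share * uncorrectable


remaining = residual_failure_share(memory_error_share, uncorrectable_ratio)
corrected_fraction = 1 - uncorrectable_ratio

print(f"remaining failure share: {float(remaining) * 100:.3f}%")
print(f"share of errors corrected: {float(corrected_fraction) * 100:.2f}%")
```

With these inputs the remaining share comes out at 0.025%, and the corrected share at 99.75%, so the 99.9% figure is a slightly generous rounding.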