| ▲ | stinkbeetle a day ago | |
This matches what I have long said, which is that adding ECC memory to consumer devices will not result in any incredible stability improvement. It will barely be a blip really. As we know from Google and other papers, most of these 10% of flips will be caused by broken or marginal hardware, of which a good proportion of which could be weeded out by running a memory tester for a while. So if you do that you're probably looking a couple out of every hundred crashes being caused by bitflips in RAM. A couple more might be due to other marginal hardware. The vast majority software. How often does your computer or browser crash? How many times per year? About 2-3 for me that I can remember. So in 50 years I might save myself one or two crashes if I had ECC. ECC itself takes about 12.5% overhead/cost. I have also had a couple of occasions where things have been OOM-killed or ground to a halt (probably because of memory shortage). Could be my money would be better spent with 10% more memory than ECC. People like to rave and rant at the greedy fatcats in the memory-industrial complex screwing consumers out of ECC, but the reality is it's not free and it's not a magical fix. Not when software causes the crashes. Software developers like Linus get incredibly annoyed about bug reports caused by bit flips. Which is understandable. I have been involved in more than one crazy Linux kernel bug that pulled in hardware teams bringing up new CPU that irritated the bug. And my experience would be far from unique. So there's a bit of throwing stones in glass houses there too. Software might be in a better position to demand improvement if they weren't responsible for most crashes by an order of magnitude... | ||