dang 5 days ago
Just don't use genetically identical hardware: https://news.ycombinator.com/item?id=32031639 https://news.ycombinator.com/item?id=32032235

Edit: wow, I can't believe we hadn't put https://news.ycombinator.com/item?id=32031243 in https://news.ycombinator.com/highlights. Fixed now.
hinkley 5 days ago
I’ve seen this up close twice, and I’m surprised it’s only twice.

Between March and September one year, 6 people on one team had to get new hard drives in their ThinkPads and rebuild their systems. All from the same PO, but doled out over the course of a project ramp-up. That was the first project where the onboarding docs were really, really good, since we got a lot of practice in a short period of time.

Long before that, the first RAID array anyone set up for my team’s usage arrived from Sun with 2 dead drives out of 10. They RMA’d us 2 more drives, and one of those was also DOA. That was a couple of years after Sun stopped burning in hardware to cut costs, which maybe wasn’t that much of a savings, all things considered.
gogusrl 5 days ago
I got burnt by this bug on freakin' Christmas Eve 2020 (https://forum.hddguru.com/viewtopic.php?f=10&t=40766). There was some data loss and a lot of lessons learned.
praccu 5 days ago
Many years ago (13?), I was around when Amazon moved SABLE from RAM to SSDs. A whole rack came from a single batch, and something like 128 disks went out at once. I was an intern, but everyone seemed very stressed.
airstrike 5 days ago
I love that "Ask HN: What'd you do while HN was down?" was a thing
Cthulhu_ 5 days ago
Man, I hit something like that once: an SSD had a firmware bug where it would stop working at an exact number of power-on hours.
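Several widely reported SSD failures of this kind came down to a power-on-hours counter stored in an integer too small to hold it. A minimal sketch in C of that failure mode, assuming a hypothetical signed 16-bit counter (the field name and values are illustrative, not taken from any specific drive's firmware):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* Hypothetical firmware counter: power-on time kept in a
         * signed 16-bit field, which tops out at 32,767 hours
         * (~3.7 years of continuous uptime). */
        int16_t power_on_hours = INT16_MAX;

        /* One more hour: on two's-complement hardware the value
         * wraps around to -32768. */
        power_on_hours = (int16_t)(power_on_hours + 1);

        /* Firmware that treats a negative counter as corrupt
         * metadata stops responding -- and every drive from the
         * same batch, powered on the same day, hits the same hour
         * count at the same time. */
        if (power_on_hours < 0) {
            printf("power_on_hours=%d: drive faults, refuses I/O\n",
                   power_on_hours);
        }
        return 0;
    }

Which is also why the advice at the top of the thread helps: staggering hardware purchases staggers the counters, so the drives don't all cross the fatal threshold at once.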