| ▲ | linkregister 11 hours ago |
| A common factoid raised in financial reports is that GPUs used in model training will lose thermal insulation due to their high utilization. The GPUs ostensibly fail. I have heard anecdotal reports of GPUs used for cryptocurrency mining having similar wear patterns. I have not seen hard data, so this could be an oft-repeated but false claim. |
|
| ▲ | Melatonic 11 hours ago | parent | next [-] |
| It's the opposite actually - most GPUs used for mining are run at a consistent temperature and load, which is good for long-term wear. Peaky loads where the GPU goes from cold to hot and back lead to more degradation because of repeated thermal expansion and contraction. This has been known for some time now. |
| |
| ▲ | Yizahi 10 hours ago | parent | next [-] | | That is a commonly repeated idea, but it doesn't take into account the countless token farms which are smaller than a datacenter. Basically anything from a single MB with 8 cards to a small shed with rigs, all of which tend to disregard common engineering practices and run hardware into the ground to maximize output until the next police raid or difficulty bump. There are plenty of photos on the internet of crappy rigs like that, and no one guarantees which GPU comes from where. Another commonly forgotten issue is that many electrical components are rated by hours of operation. And cheaper boards tend to have components with less margin. And that rated time is actually a curve, where the rated hours decrease as temperature rises. There were instances of batches of cards failing due to failing MOSFETs, for example. | | |
| ▲ | Melatonic 7 hours ago | parent | next [-] | | While I'm sure there are small amateur setups done poorly that push cards to their limits, this seems like a rarer and inefficient use. GPUs (even used) are expensive, and running them at maximum would require large costs and time to replace them regularly. Not to mention the increased cost of cooling and power. Not sure I understand the police raid mentality - why are the police raiding amateur crypto mining setups? I can totally see cards used by casual amateurs being very worn / used though - especially your example of single mobo miners who were likely also using the card for gaming and other tasks. I would imagine that anyone purposely running hardware into the ground would be running cheaper / more efficient ASICs vs expensive Nvidia GPUs since they are much easier and cheaper to replace. I would still be surprised however if most were not prioritising temps and cooling. | |
| ▲ | coryrc 9 hours ago | parent | prev | next [-] | | Specifically, we expect a halving of lifetime per 10 K increase in temperature. | |
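To put numbers on the halving rule above, here is a minimal sketch. The function name `relative_lifetime` and the 60 °C reference temperature are hypothetical choices for illustration, not figures from the thread, and real components follow their own datasheet curves.

```python
def relative_lifetime(temp_c: float, ref_temp_c: float = 60.0,
                      halving_step: float = 10.0) -> float:
    """Expected lifetime relative to running at ref_temp_c.

    Rule of thumb only: lifetime halves for every `halving_step`
    degrees of temperature increase. The 60 C reference is an
    assumed baseline, not a measured value.
    """
    return 2.0 ** (-(temp_c - ref_temp_c) / halving_step)


if __name__ == "__main__":
    # Compare a few hypothetical steady-state temperatures.
    for temp in (50, 60, 70, 80, 90):
        print(f"{temp} C: {relative_lifetime(temp):.3f}x reference lifetime")
```

Under this rule a card held 20 K above the reference would be expected to last about a quarter as long, which is why steady temperatures alone do not guarantee low wear if they are steady and high.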
| ▲ | whaleofatw2022 9 hours ago | parent | prev | next [-] | | Let's also not forget the set of miners that either overclock or don't really care about the long term in how they set up thermals. | | |
| ▲ | belval 8 hours ago | parent [-] | | Miners usually don't overclock though. If anything, underclocking is the best way to improve your ROI because it significantly reduces power consumption while retaining most of the hashrate. | | |
| ▲ | Melatonic 7 hours ago | parent | next [-] | | Exactly - more specifically undervolting. You want the minimum volts going to the card with it still performing decently. Even in amateur setups the amount of power used is a huge factor (because of the huge draw from the cards themselves and AC units to cool the room) so minimising heat is key. From what I remember most cards (even CPUs as well) hit peak efficiency when undervolted and hitting somewhere around 70-80% max load (this also depends on cooling setup). First thing to wear out would probably be the fan / cooler itself (repasting occasionally would of course help with this as thermal paste dries out with both time and heat) | | |
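A rough sketch of why undervolting pays off: to first order, CMOS dynamic power scales with V^2 * f. The model below ignores leakage and assumes hashrate tracks clock speed linearly; the function names and the 0.90 / 0.95 scale factors are hypothetical examples, not measurements from any card.

```python
def relative_dynamic_power(voltage_scale: float, clock_scale: float = 1.0) -> float:
    """Dynamic power relative to stock under the first-order P ~ C * V^2 * f model.

    Leakage power is ignored, which is an assumption of this sketch.
    """
    return voltage_scale ** 2 * clock_scale


def relative_efficiency(voltage_scale: float, clock_scale: float = 1.0) -> float:
    """Work per watt relative to stock.

    Assumption: throughput (hashrate) scales linearly with clock.
    """
    return clock_scale / relative_dynamic_power(voltage_scale, clock_scale)


if __name__ == "__main__":
    # Hypothetical tuning: 10% lower voltage, 5% lower clock.
    power = relative_dynamic_power(0.90, 0.95)
    efficiency = relative_efficiency(0.90, 0.95)
    print(f"power: {power:.3f}x stock, efficiency: {efficiency:.3f}x stock")
```

In this model the efficiency gain depends only on the voltage reduction, which matches the point that the goal is minimum stable voltage rather than minimum clock.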
| ▲ | bluGill 5 hours ago | parent [-] | | The only amateurs I know doing this are trying to heat their garage for free. So long as the heat gain is paid for, they can afford to heat an otherwise unheated building. |
| |
| ▲ | zozbot234 7 hours ago | parent | prev [-] | | Wouldn't the exact same considerations apply to AI training/inference shops, seeing as gigawatts are usually the key constraint? |
|
| |
| ▲ | WalterBright 5 hours ago | parent | prev [-] | | Why would police raid a shed housing a compute center? |
| |
| ▲ | mbesto 9 hours ago | parent | prev [-] | | Source? |
|
|
| ▲ | zozbot234 11 hours ago | parent | prev | next [-] |
| > I have heard anecdotal reports of GPUs used for cryptocurrency mining having similar wear patterns. If this were anywhere close to a common failure mode, I'm pretty sure we'd know that already, given how crypto mining GPUs were usually run to the max in makeshift settings with woefully inadequate cooling and environmental control. The overwhelming anecdotal evidence from people who have bought them is that even a "worn" crypto GPU is absolutely fine. |
|
| ▲ | munk-a 11 hours ago | parent | prev [-] |
| I can't confirm that fact - but it's important to acknowledge that consumer usage is very different from the high continuous utilization in mining and training. It is credible that the wear on cards under such extreme usage is as high as reported, considering that consumers may use their cards at peak for 5% of waking hours and the reported drop in lifespan is only about 3x when a card is used near 100% of the time - that is a believable scale for endurance loss. |