cubefox 3 hours ago

I'm confused, that doesn't make sense to me:

> They largely come from hyperscalers who want hard drives for their AI data centers, for example to store training data on them.

What type of training data? LLMs need relatively little of that. For example, DeepSeek-V3 [1], still a relatively large model:

> We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens

At 2 bytes per token, that's 29.6 terabytes. That's basically nothing compared to the amount of 4K content that is uploaded to YouTube every day.
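For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The token count and the 2 bytes per token come from the comment above; the 20 TB drive capacity is a hypothetical figure chosen only for illustration, and the estimate covers tokenized text only, not raw or multimodal data.

```python
import math

# Figures taken from the comment above.
TOKENS = 14_800_000_000_000   # 14.8 trillion pre-training tokens
BYTES_PER_TOKEN = 2           # assumption from the comment: 2 bytes per token

# Hypothetical drive capacity, for illustration only (not from the thread).
DRIVE_CAPACITY_TB = 20


def corpus_size_tb(tokens: int, bytes_per_token: int) -> float:
    """Return the tokenized corpus size in decimal terabytes (1 TB = 10**12 bytes)."""
    return tokens * bytes_per_token / 10**12


size_tb = corpus_size_tb(TOKENS, BYTES_PER_TOKEN)
drives_needed = math.ceil(size_tb / DRIVE_CAPACITY_TB)

print(f"Tokenized corpus: {size_tb:.1f} TB")
print(f"Drives needed at {DRIVE_CAPACITY_TB} TB each: {drives_needed}")
```

Under these assumptions a single copy of the tokenized corpus fits on a couple of drives, which is the point the comment is making.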
citrin_ru 5 minutes ago

A few random thoughts:

There are many new data centers, and they are being filled with servers. Most servers have at least two HDDs (mirrored) for the OS. I would not be surprised if, at a huge scale, even two HDDs per server could cause an HDD shortage.

There are likely models that are trained on 4K video, and that has to be stored somewhere too.

Even things like logs and metrics can consume petabytes for a large (and complex) cluster. And the less mature the software, the more logs you need to debug it in production.

In the AI race, investment is, if not unlimited, at least abundant. In such conditions, optimizing hardware usage is a waste of time, and velocity is the only thing that matters.
Jach 2 hours ago

You may have answered your own question if they're wanting to train models on video and other media.
greatgib 2 hours ago

Honestly, this looks highly suspicious to me. OK, they might need some big storage, on the order of petabytes. But how can that compare in proportion with the capacity currently needed by everything else that is hard-drive hungry? Any cloud service, any storage service, all the storage needed for the private photos, videos, and other media produced every day, all the consumer hardware like computers...

GPUs I understand, but hard drives look excessive. It's as if tomorrow there were a shortage of computer cabling because AI data centers need some.