jauntywundrkind 8 hours ago
The potential here with High-Bandwidth Flash is super cool. Effectively going from 8 or a dozen flash channels to a hundred or hundreds of channels would be amazing:

> The KAIST professor discussed an HBF unit having a capacity of 512 GB and a 1.638 TBps bandwidth.

One weird thing is that it's still NAND flash, and NAND flash still has limited program/erase cycles, often only a few thousand (endurance usually gets specced as drive-writes-per-day over a ~5-year warranty). If you can load a model & just keep querying it, that's not a problem. Maybe the write volume is small enough to not be so bad, but my gut is that writing context here too might present difficulty.
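For a sense of scale, a rough back-of-envelope sketch (everything below except the 512 GB capacity is an assumption, not from the article):

    # Assumed: ~3,000 P/E cycles (TLC-class NAND) and a sustained 5 GB/s of
    # context writes; only the 512 GB capacity comes from the article.
    capacity_gb = 512
    pe_cycles = 3_000
    write_gb_per_s = 5

    lifetime_writes_gb = capacity_gb * pe_cycles       # total data writable before wear-out
    seconds_to_wear = lifetime_writes_gb / write_gb_per_s
    print(f"~{seconds_to_wear / 86_400:.1f} days of continuous writes")  # ~3.6 days

Read-only model weights never touch that budget, which is why the load-once, query-forever case looks fine; it's sustained write traffic that eats it fast.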
digiown 8 hours ago
I assume the use case is that you are an inference provider, and you keep a bunch of models you might want to serve in the HBF so you can quickly swap them in and out on demand.
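Just going by the figures quoted upthread (and assuming the 1.638 TBps is actually sustainable for a full-device read), the swap time is tiny:

    # 512 GB and 1.638 TBps are the figures quoted upthread;
    # treating the bandwidth as a sustained read rate is an assumption.
    capacity_gb = 512
    bandwidth_gb_per_s = 1638

    print(f"full-device read in ~{capacity_gb / bandwidth_gb_per_s:.2f} s")  # ~0.31 s

And loading a new model into HBF costs roughly one full-device program cycle per swap, so the endurance budget should go a long way in that pattern.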