nickpsecurity 2 days ago
What needs to happen is for companies (or individuals) tired of that to pool money and build new memory products, then sell them to consumers first, for non-AI use. Failing that, round-robin scheduling of quantities so the units get spread around more. If costs are high, they might reserve a certain percentage for big business at market prices (or just under) to cover the chip's mask costs. Start with DDR5+ RAM, then GDDR5/6 for use with AI accelerators. They might try to jump straight to an HBM alternative; that could be the percentage reserved for AI buyers I just mentioned, especially if they could put 40-80GB on accelerators like Intel's Arc. If successful enough, they license MIPS's gaming GPUs to combine with this memory, with a fully open-source stack and RTOS support for military sales.
Tuna-Fish a day ago | parent
Time for my daily "HBF is coming" comment. High Bandwidth Flash is the next step for models: put the weights on flash connected to the accelerator through a very wide interface. The first users will be datacenters, but it should trickle down to consumer hardware eventually. A single 512GB stack is expected to cost about $200 and provide 1.6TB/s of reads. You still need some fast DRAM for the KV cache and for activations, but the weights should sit on flash.
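For a sense of what that bandwidth buys, here's a back-of-the-envelope sketch in Python. The 512GB / 1.6TB/s figures come from the comment above; the 400B-parameter, 8-bit-quantized dense model is a hypothetical assumption, chosen because dense decode has to stream every weight once per generated token:

    # Back-of-the-envelope: decode speed when weights stream from HBF.
    # 1.6 TB/s of reads per 512GB stack is the figure from the comment;
    # the model size and quantization below are hypothetical assumptions.
    flash_read_gb_s = 1600   # GB/s of reads per HBF stack
    params_billions = 400    # hypothetical dense model, billions of params
    bytes_per_param = 1      # 8-bit quantized weights

    # Dense decode reads every weight once per generated token,
    # so weight traffic per token ~= total weight bytes.
    gb_per_token = params_billions * bytes_per_param
    tokens_per_s = flash_read_gb_s / gb_per_token
    print(f"~{tokens_per_s:.1f} tokens/s per stack")   # ~4.0

So a single stack gives slow-but-usable single-stream decode on a big dense model; MoE sparsity, batching, or multiple stacks change the math, and the KV cache sitting in DRAM is outside this bound.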
| ||||||||||||||||||||||||||||||||