| ▲ | Kirby64 an hour ago | |
> They expect to scale up to real LLMs by splitting a model layer-wise across several chips, which they can do without incurring any throughput penalty. I’ll patiently wait to see this in reality. Their demonstration hardware is a 250W chip that is enormous in die area for the model size. They’re making a lot of claims, but until they can deliver then it’s nearly vaporware in my view. I’d be happy to be proven wrong, but I think they’re going to quickly run into hardware realities quite soon if they think they can just chain a bunch of chips together to achieve the same performance on larger sizes. | ||