wkat4242 a day ago

This article is so dumb. It totally ignores the memory price explosion that will make laptops with large, fast memory unfeasible for years, and it states stuff like this:

> How many TOPS do you need to run state-of-the-art models with hundreds of millions of parameters? No one knows exactly. It’s not possible to run these models on today’s consumer hardware, so real-world tests just can’t be done.

We know exactly the performance needed for a given level of responsiveness. TOPS is just a measurement, independent of the type of hardware it runs on.

The fewer TOPS, the slower the model runs, so the user experience suffers. Memory bandwidth and latency play a huge role too. And context: increase the context and the LLM becomes much slower.

We don't need to wait for consumer hardware to know how much is needed. We can calculate that for given situations.
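A rough sketch of the kind of back-of-envelope calculation meant here: decode speed is roughly memory-bandwidth-bound (the weights are reread for every generated token) and prefill is roughly compute-bound (~2 FLOPs per parameter per prompt token). The model size, quantization, and responsiveness targets below are illustrative assumptions, not measurements.

```python
def required_specs(params_b: float, bits_per_weight: int,
                   target_tok_per_s: float, prompt_tokens: int,
                   target_prefill_s: float) -> dict:
    """Rough lower bounds: decode is bandwidth-bound, prefill is compute-bound."""
    weight_bytes = params_b * 1e9 * bits_per_weight / 8

    # Decode: each generated token rereads roughly all weights once,
    # so required bandwidth ~ weight bytes * tokens/sec (KV-cache traffic ignored).
    decode_bw_gb_s = weight_bytes * target_tok_per_s / 1e9

    # Prefill: ~2 FLOPs (one multiply-accumulate) per parameter per prompt token.
    prefill_tops = 2 * params_b * 1e9 * prompt_tokens / target_prefill_s / 1e12

    return {"weights_GB": round(weight_bytes / 1e9, 1),
            "decode_bandwidth_GB_s": round(decode_bw_gb_s, 1),
            "prefill_TOPS": round(prefill_tops, 1)}


# Example: a 70B model at 4-bit, 20 tok/s decode, a 4k-token prompt processed in 2 s.
print(required_specs(params_b=70, bits_per_weight=4,
                     target_tok_per_s=20, prompt_tokens=4096,
                     target_prefill_s=2.0))
# -> ~35 GB of RAM for the weights, ~700 GB/s of bandwidth, ~287 TOPS
```

Those numbers already tell you why today's consumer laptops struggle: the bandwidth requirement alone is several times what typical laptop memory delivers, and that's before a long context makes decoding slower still.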

It also pretends small models are not useful at all.

I think the massive cloud investments will pull momentum away from local AI, unfortunately. That trend makes local memory expensive, and all those cloud billions have to be made back, so all the vendors are pushing their cloud subscriptions. I'm sure some functions will be local, but the brunt of it will be in the cloud, sadly.

dcreater 7 hours ago | parent | next [-]

Horrible article. Low effort, low knowledge. Had no idea the bar was so low for an IEEE publication.

layer8 7 hours ago | parent | prev | next [-]

The article is from mid-November (and was probably written even earlier), when the RAM price explosion wasn't as striking yet.

vegabook a day ago | parent | prev [-]

also, state of the art models have hundreds of _billions_ of parameters.

omneity a day ago | parent [-]

It tells you about their ambitions...