hbbio 6 hours ago

Strange that they apparently raised $169M (really?) and the website looks like this. Don't get me wrong: plain HTML would do if it were executed perfectly, or you'd expect something heavily designed. But script-kiddie vibe-coded seems off.

The idea is good though and could work.

ACCount37 6 hours ago | parent

Strange that they raised money at all with an idea like this.

It's a bad idea that can't work well. Not while the field is advancing the way it is.

Manufacturing silicon is a long pipeline - and in the world of AI, a one-year capability gap isn't something you can afford. You build a SOTA model into your chips, and by the time you get those chips, it's outperformed at its tasks by open-weights models half its size.

Now, if AI progress somehow ground to a screeching halt, with model upgrades coming out every 4 years instead of every 4 months? Maybe it'd be viable. As is, it's a waste of silicon.

small_model 6 hours ago | parent | next

Poverty of imagination here. There are plenty of uses for this, and it's a prototype at this stage.

ACCount37 6 hours ago | parent

What uses, exactly?

The prototype is: silicon with a Llama 3.1 8B etched into it. Today's 4B models already outperform it.

A token rate in five digits is a major technical flex, but does anyone really need to run a very dumb model at this speed?

The only things that come to mind that could reap a benefit are asymmetric exotics like VLA action policies and voice stages for V2V models. Both are "small fast low-latency model backed by a large smart model" setups, and both depend on model-to-model comms, which this doesn't demonstrate.
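To sketch what I mean by that pattern (purely illustrative - the model objects, the confidence method, and the threshold are all made up, not anything this chip demonstrates):

    # Hypothetical "small fast model backed by a large smart model" cascade.
    # The small model drafts an answer and escalates to the big model when
    # it isn't confident. Every name here is invented for illustration.
    def cascade(prompt, small_model, large_model, threshold=0.8):
        draft, confidence = small_model.generate_with_confidence(prompt)
        if confidence >= threshold:
            return draft  # fast path: the etched-in-silicon model suffices
        return large_model.generate(prompt)  # slow path: escalate

The whole point is the fast path, and the fast path is only worth anything if the handoff to the big model is cheap - which is exactly the model-to-model comms part.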

In a way, it's an I/O accelerator rather than an inference engine. At best.

MITSardine 5 hours ago | parent | next

With LLMs this fast, you could imagine using them as any old function in programs.
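Something like this, say - assuming an OpenAI-compatible server sitting in front of the chip; the URL and model name below are placeholders:

    # Treating a fast local LLM like any other library function.
    # Assumes an OpenAI-compatible endpoint on localhost; the URL and
    # model name are placeholders, not a real API for this chip.
    import json
    import urllib.request

    def classify_sentiment(text: str) -> str:
        payload = json.dumps({
            "model": "local-8b",
            "messages": [{
                "role": "user",
                "content": f"One word, positive or negative: {text}",
            }],
        }).encode()
        req = urllib.request.Request(
            "http://localhost:8000/v1/chat/completions",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"].strip()

At five-digit tokens/sec, a call like that becomes cheap enough to drop into an inner loop, which is where "any old function" stops being a metaphor.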

leoedin 6 hours ago | parent | prev

Even if this first generation is not useful, the learning and architecture decisions from it will be. You really can't think of any value in a chip that can run LLMs at high speed, locally, for 1/10 of the energy budget and (presumably) at significantly lower cost than a GPU?

If you look at any development in computing, ASICs are the next step. It seems almost inevitable. Yes, it will always trail behind the state of the art. But the value will come quickly within a few generations.

xav_authentique 5 hours ago | parent | prev

Maybe they're betting on model improvements plateauing, and that a fairly stabilized, capable model that runs orders of magnitude faster than on GPUs could be valuable in the future?