fastball 6 days ago

It's very relevant if any other EU firm can take open models (regardless of provenance) and fine-tune them in the same way. Mistral really needs to be producing at-or-near SOTA models to be differentiated at all, and they are not.

kergonath 5 days ago

Not even then. You need to compare the end products, which are not the open weight models.

I don’t care whether the LLM can have "PhD level thoughts" (lol) or is able to code golf like a Facebook engineer. It needs to be able to do its task (so all the infrastructure around the model matters just as much as the model itself) efficiently (so small models have an advantage). There are billions of weights in general-purpose models that are irrelevant for specialised uses.

The way to go is efficient models adapted to their task. It’s exactly the same as with industrial robots. Geeks get excited every now and then about humanoid robots, but in real life we don’t need robots to stand on two legs or our LLMs to cite Shakespeare.