vessenes 2 hours ago

Sovereign-weights models are a good thing, for a variety of reasons, not least encapsulating human diversity around the globe.

I chatted with the desktop chat version for a while today; it claims its knowledge cutoff is June ‘25. It refused to say what size model I was chatting with. From the token speed, I believe the default routing is, at largest, the 30B MoE model.

That model is not currently good. Or, another way to say it: it’s competitive with the state of the art from two years ago. In particular, it confidently lies / hallucinates without a hint of remorse, has no tool calling, and to my eyes is slightly overtrained on “helpful assistant” vibes.

I am cautiously hopeful, looking at its stats vis-à-vis OpenAI's OSS 120B, that it has NOT been finetuned on OpenAI/Anthropic output. It’s worse than OSS 120B at some things in the benchmarks, and I think that’s a REALLY GOOD sign that we might have a genuinely novel model being built; the tone is slightly different as well.

Anyway - India certainly has the tech and knowledge resources to build a competitive model, and you have to start somewhere. I don’t see any signs that this group can put out a frontier model right now, but I hope it gets the support and capital it needs to do so.

dartharva an hour ago | parent | next

> India certainly has the tech and knowledge resources to build a competitive model

In what universe? India has almost none of the expensive infrastructure and chip stockpiles that its American and Chinese counterparts have, which are needed to build frontier models, even if it had the necessary expertise (which I also doubt it does).

Sporktacular 2 hours ago | parent | prev | next

I'd guess making this a national pride thing will just make it less diverse. The answer would be training models on broader sources, not building more nationalistic models.

vessenes 2 hours ago | parent

No, that will decrease diversity across the model spectrum taken as an entire population.

segmondy 2 hours ago | parent | prev

You have no idea what you are talking about if you are asking the model what size it is or claiming that a model lies.

vessenes 2 hours ago | parent

Please enlighten me.

wizzwizz4 an hour ago | parent

Language models entirely lack introspective capacity. Expecting a language model to know what size it is is a category error: you might as well expect an image classifier to know the uptime of the machine it's running on.

Language models manipulate words, not facts: to say they "lie" suggests they are capable of telling the truth, but they don't even have a notion of "truth": only "probable token sequence according to distribution inferred from training data". (And even that goes out the window after a reinforcement learning pass.)

It would be more accurate to say that they're always bluffing. Sometimes those bluffs produce sentences that human readers interpret as corresponding to actual states of affairs, and other times readers interpret them as corresponding to false states of affairs, but the model's computation is the same in both cases.
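The "probable token sequence" point can be sketched in a few lines. Here is a toy bigram model (the corpus, function name, and everything else in it are my illustrative inventions, not anything from a real LLM): it picks the statistically likeliest continuation of a token, and nothing in the computation asks whether the resulting sentence is true.

```python
from collections import Counter, defaultdict

# Toy bigram "language model": count which token follows which in a corpus.
# Real LLMs use neural networks over far larger contexts, but the principle
# is the same -- continuations are scored by probability, not by truth.
corpus = "the model is large the model is helpful the model is large".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def most_probable_next(token):
    # Return the likeliest continuation seen in training data. Whether the
    # resulting sentence is true is simply not part of the computation.
    return follows[token].most_common(1)[0][0]

print(most_probable_next("is"))  # "large": seen twice, vs. "helpful" once
```

If the corpus had said "helpful" more often, the model would "assert" that instead, with exactly the same confidence: that is the sense in which "lying" is the wrong frame.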