mtct88 10 hours ago

Nice release from the Qwen team.

Small openweight coding models are, imho, the way to go for custom agents tailored to the specific needs of dev shops that are restricted from accessing public models.

I'm thinking about banking and healthcare sector development agencies, for example.

It's a shame this remains a market largely overlooked by Western players, Mistral being the only one moving in that direction.

lelanthran 10 hours ago | parent | next [-]

> It's a shame this remains a market largely overlooked by Western players, Mistral being the only one moving in that direction.

I said in a recent comment that Mistral is the only one of the current players that appears to be moving towards a sustainable business - all the other AI companies are simply looking for a big payday, not to operate sustainably.

gunalx 6 hours ago | parent [-]

Meta with the Llama series as well; they just didn't manage to keep upping the game with and after Llama 4.

Aurornis 9 hours ago | parent | prev | next [-]

I play with the small open weight models and I disagree. They are fun, but they are not in the same class as hosted models running on big hardware.

If an organization forbids external models, it should invest in the hardware to run bigger open models. The small models are a waste of time for serious work when more capable models are available.

NitpickLawyer 10 hours ago | parent | prev | next [-]

I agree with the sentiment, but these models aren't suited for that. You can run much bigger models on prem with ~100k of hardware, and those can actually be useful in real-world tasks. These small models are fun to play with, but are nowhere close to solving the needs of a dev shop working in healthcare or banking, sadly.

kennethops 10 hours ago | parent | prev | next [-]

I love the idea of building a competitor to open-weight models, but damn is this an expensive game to play.

smrtinsert 10 hours ago | parent | prev [-]

How true is this? How does a regulated industry confirm the model itself wasn't trained with malicious intent?

ndriscoll 10 hours ago | parent [-]

Why would it matter if the model is trained with malicious intent? It's a pure function. The harness controls security policies.

coppsilgold 6 hours ago | parent [-]

Much like a developer can insert a backdoor as a "bug", so can an LLM that was trained to do it.

One way you could probably do it is by identifying a commonly used library that can be misused in a way that would allow some kind of time-of-check to time-of-use (TOCTOU) exploit. Then you train the LLM to use the library incorrectly in this way.
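The textbook instance of this pattern (a hypothetical sketch, not whatever specific library the parent has in mind) is check-by-path followed by open-by-path: between the check and the open, an attacker can swap the file for a symlink, so the "validated" path is no longer what gets read. A subtly trained model could keep emitting the first form, which looks perfectly reasonable in review:

```python
import os
import stat

def read_config_vulnerable(path: str) -> str:
    # Time-of-check: validate the file by path.
    if not os.path.isfile(path) or not os.access(path, os.R_OK):
        raise PermissionError(f"refusing to read {path}")
    # ... race window: the path can be replaced here ...
    # Time-of-use: open by path again, trusting the stale check.
    with open(path) as f:
        return f.read()

def read_config_safer(path: str) -> str:
    # Open once, then validate the already-opened descriptor, so the
    # check and the use refer to the same underlying file.
    # O_NOFOLLOW (POSIX-only, hence the getattr guard) additionally
    # refuses to open a symlink at the final path component.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_NOFOLLOW", 0))
    try:
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            raise PermissionError(f"{path} is not a regular file")
    except BaseException:
        os.close(fd)
        raise
    with os.fdopen(fd, "r") as f:  # fdopen takes ownership of fd
        return f.read()
```

Both functions behave identically on the happy path, which is exactly why the vulnerable variant survives code review; the difference only shows up under a concurrent attacker.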