pstuart a day ago

My poorly informed hope is that we can have a mixture of experts with highly tuned models on areas of focus. If I'm coding in language Foo, I only care about a model that understands Foo and its ecosystem. I imagine that should be self-hostable now.

tsimionescu a day ago

A model that only understands, say, Java is useless: you need a model that understands English, can do some kind of reasoning, has some idea of how the human world works, and also knows Java. The vast majority of the computational effort is spent on the former; the Java knowledge is almost an afterthought. So, a model that can only program in Java is not going to be meaningfully smaller than a model that can program in ~all programming languages.

exe34 a day ago

My suspicion is that this is not how intelligence works. Creativity comes from cross-breeding ideas from many domains.

pstuart 14 hours ago

Sure, but in the context I was considering, creativity itself wasn't a concern.

For coding, creativity is not necessarily a good thing. There are well-established patterns, algorithms, and applications that could reasonably be construed as "good enough" to assist with the coding itself. Adding a human language model over that to understand the user's intent could be considered an overlay on the coding model.

I confess that this is a willful projection of my hope to be able to self-host agents on affordable hardware. A frontier model on powerful hardware would always be preferable, but sometimes "good enough" is just that.

exe34 2 hours ago

I want to self-host too, but I've spent the last few weeks playing with Claude Code on my hobby projects: it solves abstract problems with code and gives actionable reviews, whereas qwen code with qwen3-coder-480 seems to just write simple code and give generic feedback.