segmondy 6 hours ago
You do realize Claude Opus/GPT-5 are probably something like 1000B-2000B models? So getting a model that's < 60B to offer the same level of performance would be a miracle...
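As a rough sanity check on what those sizes mean in practice, here's a back-of-envelope sketch in Python. The parameter counts are the speculation above, not disclosed figures, and fp16 weights (2 bytes/param, weights only, no KV cache or activations) are assumed:

    # Rough weight-memory footprint, assuming fp16 (2 bytes per parameter).
    # Parameter counts are speculative, taken from the comment above.
    def weight_gb(params_billion, bytes_per_param=2):
        return params_billion * 1e9 * bytes_per_param / 1e9  # bytes -> GB

    for name, n in [("~60B open model", 60),
                    ("speculated 1000B", 1000),
                    ("speculated 2000B", 2000)]:
        print(f"{name}: ~{weight_gb(n):.0f} GB of weights at fp16")

That's ~120 GB versus ~2-4 TB just to hold the weights, which is the gap the comment is pointing at.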
epolanski 20 minutes ago
Aren't both the latest Opus and Sonnet smaller than the previous versions?
jrop 6 hours ago
I don't buy this. I've long wondered whether the larger models, while exhibiting more useful knowledge, are also more wasteful, as we greedily push the frontier of "bigger is getting us better results, so make it bigger". Qwen3-Coder-Next seems to be a point in favor of that thought: we need to spend some time exploring what smaller models are capable of. Perhaps I'm grossly wrong -- I guess time will tell.
| |||||||||||||||||||||||||||||||||||||||||||||||
regularfry 2 hours ago
There is (must be, by information theory) a size/capacity efficiency frontier. There is no particular reason to think we're anywhere near it right now.
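One loosely related empirical handle on that frontier is the Chinchilla scaling fit (Hoffmann et al. 2022), which models loss as L(N, D) = E + A/N^alpha + B/D^beta for N parameters trained on D tokens. A minimal sketch with their published constants, purely illustrative since the fit comes from one model family and training recipe, and the token budget below is a made-up example:

    # Chinchilla loss fit, constants as published in Hoffmann et al. 2022:
    # L(N, D) = E + A / N**alpha + B / D**beta
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

    def loss(n_params, n_tokens):
        return E + A / n_params**alpha + B / n_tokens**beta

    # Same hypothetical 15T-token budget, different parameter counts:
    for n in [60e9, 1000e9, 2000e9]:
        print(f"{n/1e9:.0f}B params: predicted loss ~{loss(n, 15e12):.3f}")

At a fixed data budget the fit predicts only a modest loss reduction from scaling parameters alone (roughly 1.86 down to 1.80 here), though small loss deltas can still correspond to large capability differences, so this says more about diminishing returns than about where the true frontier sits.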