2ndorderthought 2 hours ago
It's pretty close already. Check out qwen3.6 27b if you haven't. People are vibe coding and doing agentic coding with it on a single GPU. It's more finicky than Claude, but if you hand-hold it a bit, it's crazy.
iugtmkbdfil834 a minute ago
Eh. It's good in terms of results (accuracy, good recommendations, and so on), but slow when it comes to actual inference. On a local 128 GB machine, it took over five minutes to brainstorm a garage door opening mechanism with some additional constraints thrown in for spice.
gchamonlive an hour ago
I see that going around, and either the test cases are too simplistic or I'm doing something wrong. I have a server with a 3090 in it, enough to run qwen3.6, but I haven't had much luck using it with either codex or oh-my-pi. They work, but the model gets really slow at around 64k of context, and attention degrades quickly. Sometimes you'll run a prompt, the model will load a test file, and then it says something like "I was presented with a test file but no command. What should I do with it?".

So yeah, while it's true that qwen3.6 is good for agentic coding, it's not very good at exploring the codebase and coming up with plans. Today you need to pair it with a model capable of ingesting the whole context and producing a detailed plan, and even then the implementation might take 10x as long as Sonnet or Gemini 3 would take to crunch through the same plan.
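One common workaround for the context-window slowdown described above is to trim older turns before each agent step so the prompt stays under a fixed token budget. This is just a sketch of that idea, not anything codex or oh-my-pi actually does; `count_tokens` is a stand-in for the model's real tokenizer:

```python
def count_tokens(text):
    # Placeholder tokenizer: counts whitespace-separated words.
    # Swap in the model's actual tokenizer for real budgets.
    return len(text.split())

def trim_history(messages, budget):
    """Keep the system prompt plus the newest messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    used = sum(count_tokens(m["content"]) for m in system)
    kept = []
    for m in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(m["content"])
        if used + cost > budget:
            break  # oldest turns are dropped first
        kept.append(m)
        used += cost
    return system + list(reversed(kept))

history = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "read the test file " + "x " * 50},
    {"role": "assistant", "content": "ok " * 10},
    {"role": "user", "content": "now run the tests"},
]
trimmed = trim_history(history, budget=20)
```

The trade-off is exactly the failure mode above: once the turn that loaded the test file is trimmed away, the model no longer knows why the file is in context, so agent frameworks usually summarize dropped turns rather than discard them outright.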