nl 7 hours ago

Opus is still significantly better than open-weight models.

GLM 5.2 comes close on agentic tasks, but doesn't code as well.

Kimi 2.6 and Deepseek v4 Pro write great code but lose track when doing agentic workflows. They were better than Sonnet 4.6 but not as good as Opus. I haven't compared them to Sonnet 5 yet.

nextaccountic 3 hours ago | parent | next [-]

> GLM 5.2 comes close on agentic tasks, but doesn't code as well.

> Kimi 2.6 and Deepseek v4 Pro write great code but lose track when doing agentic workflows.

Here is an idea: can GLM 5.2 and Kimi 2.6 be combined in some way? Maybe GLM does the planning and Kimi does the coding.
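
Roughly what I have in mind, as a minimal sketch only. The names `plan_then_code`, `fake_planner` and `fake_coder` are made up for illustration, and the two stubs stand in for whatever client calls you would actually make to each model; nothing here is a real GLM or Kimi API.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to a completion string.
Model = Callable[[str], str]


def plan_then_code(task: str, planner: Model, coder: Model) -> List[str]:
    """Ask the planner for one step per line, then send each step to the coder."""
    plan = planner(f"Break this task into short steps, one per line:\n{task}")
    # Drop blank lines so each remaining line is treated as one step.
    steps = [line.strip() for line in plan.splitlines() if line.strip()]
    return [coder(f"Task: {task}\nImplement this step: {step}") for step in steps]


# Stand-in stubs so the sketch runs without calling any model.
def fake_planner(prompt: str) -> str:
    return "parse input\nwrite output\n"


def fake_coder(prompt: str) -> str:
    # Echo the last line of the prompt, which names the step to implement.
    return "# code for: " + prompt.splitlines()[-1]


results = plan_then_code("convert CSV to JSON", fake_planner, fake_coder)
```

The planner only ever sees the task and the coder only ever sees one step at a time, so whether this helps would depend on the coder not needing the wider context that it reportedly loses track of.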

atemerev 6 hours ago | parent | prev [-]

I use GLM 5.2 routinely for coding and agentic tasks. It is not without its quirks, but I generally find it on the level of Opus 4.7 or so, and without all these "cybersecurity" rejections.