edg5000 4 hours ago

Has anybody used V4 hard, for the most challenging tasks (agentically, locally)? It's so hard to compare without putting serious time into it, like spending a year with the model daily.

Oras 3 hours ago | parent [-]

I tried it for two tasks using Claude Code, on max effort.

1. Web platform: I asked it to analyse a feature for creating reports and come up with a better solution and better UX. It did great, I would say on par with Sonnet 4.6 or even Opus, considering the thinking and explanation.

2. Mac app with some basic functionality: it did well from a functional perspective, but when I then used Opus 4.7 to evaluate it and suggest improvements, I noticed it had missed many vital points in the design system and usability.

I think it's a leap. I haven't used a model this capable that isn't from OpenAI or Anthropic.

kroaton an hour ago | parent [-]

Claude Code poisons non-Anthropic models in usage. We found this out when the code was leaked. Use a fork or OpenCode/pi-coding-agent.

Oras an hour ago | parent | next [-]

Mind pointing to where you found this in the leaked code?

swader999 an hour ago | parent | prev [-]

By poisons, do you mean it degrades their quality of output somehow?