Has anybody pushed V4 hard on the most challenging tasks (agentically, locally)? It's so hard to compare without putting serious time into it, like spending a year using the model daily.

Oras 3 hours ago

I tried it for two tasks using Claude Code, on max effort.

1. Web platform: I asked it to analyse a report-creation feature and come up with a better solution and better UX. It did great, I would say on par with Sonnet 4.6 or even Opus, considering the thinking and explanation.

2. Mac app with some basic functionality: it did well from a functional perspective, but when I used Opus 4.7 to evaluate the result and suggest improvements, I noticed it had missed many vital points in the design system and usability.

I think it’s a leap. I haven’t used a model this capable that is not from OpenAI or Anthropic.

kroaton an hour ago

Claude Code poisons non-Anthropic models in usage. We found this out when the code was leaked. Use a fork or OpenCode/pi-coding-agent.

	Oras an hour ago
		Mind sending where you found this in the leaked code?
	swader999 an hour ago
		By "poisons", do you mean it degrades their quality of output somehow?