nothinkjustai 13 hours ago

Because of marketing and vibes mostly.

Heck I prefer DeepSeek to both of those.

mcv 3 hours ago

I feel you. I'd prefer to stick entirely with local open source models. I tried using Aider and Qwen last week, and while it's still impressive what it can do with just local resources and entirely for free, its error rate is too high, and it's clearly not remotely in the same league as Claude Code.

josephg 12 hours ago

Wow, I'm really surprised. I tried DeepSeek (their best model, through the official API). It's extremely cheap, but it's clearly not as good at programming as Opus 4.7. It seems nowhere near as good at making high-level design choices. DeepSeek also seems to get stuck in whack-a-mole fixing loops much more than Opus. I stopped it at one point and asked Opus to solve the problem it was trying to solve, and it saw the solution immediately.

I was running DeepSeek through Claude's code agent harness. Maybe it works better through a different tool?

zmmmmm 12 hours ago

I've given V4 Pro some curly things and I was impressed at how it figured them out. I agree high-level design is not its forte. But it sat in a loop and doggedly debugged a crazy dependency issue, coming to the right answer over the course of 15 minutes, which impressed me.

nothinkjustai 10 hours ago

Idk, I don’t vibe code so even the flash model is great for generating code for myself. I tend to do the planning and design myself though.

The harness also matters, and so does the provider. I was using OpenRouter and switched to the DeepSeek API, and suddenly all the tool call issues I was having resolved themselves. Flash is so damn fast at doing stuff like generating boilerplate that I can’t go back to the bigger, slower models.

esafak 12 hours ago

You tried v4?

codybontecou 12 hours ago

I tried to like it, but it eventually got stuck in a near-infinite loop trying to debug an extra curly bracket in an iOS app.

That and the lack of image-read support surprised me. I'm a big fan of feeding screenshots into my LLM, and that killed it for me.

josephg 12 hours ago

Yeah, v4.

I would have been much more impressed with v4 about 6 months ago. But I've been spoiled by Opus 4.7. DeepSeek isn't at the same level.

zmmmmm 13 hours ago

Interestingly, I had the same experience, and weirdly it's in part because it is clearly less intelligent. It's more of a mechanistic tool just doing what I ask (but still very smart and very competent about it) and less trying to win a Nobel Prize with each answer. Turns out I actually like that.