fouric 5 hours ago
I'd generally agree about Deepseek being as good as Sonnet, but I have extreme trouble with prompt compliance on V4 Pro in a way that I've never had with Sonnet. I'll tell it "find the bug, but don't fix it" or "please use this tool I just developed" and it'll ignore me a large fraction of the time. It's bad enough that I'm working on guardrails at the harness level, because prompting appears to be useless. Do you have the same issue?
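For anyone wondering what "guardrails at the harness level" could mean in practice, here is a minimal sketch, not the actual harness described above. It assumes the harness sees each tool call before executing it; the tool names, the `ToolCall` type, and `check_tool_call` are all hypothetical. The idea is that "find the bug, but don't fix it" becomes a read-only allowlist the model cannot talk its way around.

```python
from dataclasses import dataclass

# Hypothetical tool names; a real harness would substitute its own.
READ_ONLY_TOOLS = {"read_file", "grep", "run_tests"}


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: dict


def check_tool_call(call: ToolCall, allowed: set[str]) -> tuple[bool, str]:
    """Return (permitted, message).

    On refusal, the message is meant to be fed back to the model
    in place of the tool result, so it knows why nothing happened.
    """
    if call.name in allowed:
        return True, ""
    return False, (
        f"Tool '{call.name}' is disabled for this task; "
        f"allowed tools: {sorted(allowed)}"
    )
```

With this in place, a `write_file` call during a diagnose-only task is rejected by the harness regardless of what the prompt said.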
stavros 4 hours ago
I have Opus make a fairly detailed plan, then Deepseek implements it, and GPT reviews the result. With that setup I have zero issues, probably because what you mention is handled: the plan keeps it on track and the reviewer catches any deviations. Now that you mention it, I have seen it do a few things that weren't in the plan. The reviewer caught them, so they didn't cause a problem, and it's so cheap that overall it's a massive improvement.
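Roughly, the plan / implement / review setup above has the following shape. This is a sketch under assumptions, not the actual tooling: `run_pipeline` and its three callables are hypothetical stand-ins for calls to the planning, implementing, and reviewing models, and the reviewer is assumed to return an empty string when it has no objections.

```python
from typing import Callable


def run_pipeline(
    task: str,
    planner: Callable[[str], str],
    implementer: Callable[[str, str], str],
    reviewer: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Plan once, then loop implement -> review until the reviewer approves."""
    plan = planner(task)
    feedback = ""  # no reviewer feedback on the first round
    for _ in range(max_rounds):
        patch = implementer(plan, feedback)
        # Assumption: the reviewer returns "" when the patch matches the plan.
        feedback = reviewer(plan, patch)
        if not feedback:
            return patch
    raise RuntimeError(
        f"reviewer still objecting after {max_rounds} rounds: {feedback}"
    )
```

The reviewer gets the plan as well as the patch, which is what lets it flag changes that weren't in the plan rather than only outright bugs.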