Feels about right.

I launched internal demos of Claude Code and DeepSeek on the same day, and we burned through our monthly Claude allowance in just over a week, with more than half of that budget spent in a single day. With DS, people can't get through that same amount of money in a month, not even close.

As a result, Claude feels like an expensive toy, while DS is a shovel, purely because developers do not feel like they are eating into a precious resource while using it. It also does not feel like there is much of a difference in capability between Claude and DS-pro. DS-pro and flash do feel like sonnet/opus and haiku respectively, but flash is still very, very capable.

onlyrealcuzzo 16 hours ago

I rage canceled Claude today.

After 2 weeks of Claude getting progressively worse and worse, today was the final straw.

I don't care if they have a phone app. The model is COMPLETE garbage after you subscribe long enough and they think they've "got you".

I can't code on my phone if the model literally moves in the wrong direction and does the opposite of what I tell it to. If I wanted to make my code worse, I'd just randomly commit garbage. I don't need a mobile app for that.

couchdb_ouchdb 14 hours ago

I've seen a lot of this sentiment over the previous six months from people on reddit. I have yet to experience this myself as a developer with over 20 years of experience.

johnfink8 an hour ago

I see a lot of the "4.7 is a downgrade" sentiment. 4.7 does (mostly) what you ask it to do. 4.6 does what it thinks it should do. As someone with 20 years of writing my own code, I want the former, but the loud contingent online wants the latter.

When you're on a mature codebase with 500k+ lines of code, I haven't seen anything else be as effective as 4.7.

	onlyrealcuzzo 9 minutes ago
		I can tell you for a fact, Claude 4.7 was NOT doing what I told it to do (in fact the clear and complete opposite, repeatedly) on a pretty simple architectural refactor, and Codex did better and DeepSeek much better. It was given very simple ways to verify success. It simply didn't use them and said it was at a good stopping point, despite moving in the WRONG direction, not even doing 1% of the task, and being told to see the task through to completion. Meanwhile, Codex broke it down into 3 steps and just got it done... No "I'm going to give it to you straight, this is a large risky commit that could go sideways, so I'm just not going to do anything instead." Claude worked on it for almost 200 commits over 2 weeks, and I typically needed to prompt it 3x to even TRY to make any progress instead of just wasting tokens ignoring me and telling me how big and risky it is. Maybe Claude is just particularly terrible at this type of refactor. I'm not sure why that would be.

fendy3002 3 hours ago

As always, I think this happens more to vibe coders. They don't understand that a bigger project means worse AI performance. On top of that, Opus feels nerfed at understanding prompts, so if your spec is bad you won't get a good result.

solenoid0937 4 hours ago

It's the same phenomenon as when you learn a new vocabulary word and then see it everywhere.

People heard "Claude is nerfed" and now they see it everywhere; they notice failures a lot more than they would have otherwise.

Doesn't matter that Claude is not, in fact, nerfed. Perception is powerful and most humans are not rational.

arkadiytehgraet 17 minutes ago

This account is an LLM-hype peddler, shilling for Anthropic (check comment history). If they say that Claude is not nerfed, then most likely it is, in fact, nerfed.

fendy3002 3 hours ago

Oh, Opus is nerfed, sure, but not that hard. Early this year Opus 4.6 could understand your prompt and your intention easily; it got worse around mid-April. Opus 4.7 is even worse than that.

However, that's just it: you just need to improve your prompt and make it clearer, and it will perform just as well.

	fragmede 3 hours ago
		Or just switch over to OpenAI. Codex-5.5 is quite good.

dgellow 6 hours ago

Opus 4.7 has been a real downgrade for me. I’m back to mid-2025, when I had to catch all the completely intermediary goals/assumptions the model was creating for itself.

	kaeluka 35 minutes ago
		It's sort of good at thinking, writing specs, etc. Also debugging. But as a coder, I see no advantage to Opus 4.6, and I already preferred Sonnet over Opus 4.6 most of the time.
	Wowfunhappy 5 hours ago
		You can still use older versions of Opus if they work better for you. You just need to set the environment variable.
	chantepierre 5 hours ago
		I felt that too, but found it worked way better when I only invoked it with `claude --effort max`.

colechristensen 6 hours ago

What it does seem like is that they're tuning some knobs up and down or releasing new versions of models or system prompts that result in the model getting dumber and smarter in waves.

Opus has been dumb this week.

Claude was having a lot of capacity problems and downtime and then this week that has been much less obvious... and the model is dumber.

It could also just be luck and my impressions are false... who knows.

Our_Benefactors 9 hours ago

It’s because it’s not true; there’s no evidence for it that passes the sniff test. No lab is “shipping a worse model once they’ve got you”. People have a bad few days and blame the model providers instead of stepping back to fix their workflow.

	raincole 6 hours ago
		When it comes to something with random results (unfortunately that's what LLMs are), people will think the odds are rigged against them. It's a good thing that hype-chasers are cancelling, though, so the rest of us can use the services with reasonable latency.

mmusc 14 hours ago

All these tools are close to feature parity. The GitHub CLI allows remote sessions and can run Anthropic models anyway.

kridsdale1 16 hours ago

Considered Gemini?

	operatingthetan 16 hours ago
		Gemini got a big reduction in usage limits this week. There was backlash, and they added 3x usage for Antigravity a day later, but I haven't really tried it out to get a feel for it yet.
	saulpw 16 hours ago
		Google has burnt all of its goodwill in dev communities, so no, I don't think Gemini is worth consideration.