unshavedyak 5 hours ago

It's pretty funny, i'm a $200/m Claude subscriber and i've had little need to use anything else. However the more Claude has been restricting my workflow (notably around the recent IDE/-p usage change) the more i've been wanting to go elsewhere.

I'm concerned since i really want SOTA reasoning, but DeepSeek still has me interested.

Alifatisk 4 hours ago | parent | next [-]

> I'm concerned since i really want SOTA reasoning

I think you should give other models a try and see how much they differ from SOTA models. I did this and realized even Qwen-2.5-Max was enough. I am sure even Claude Sonnet 3.5 is enough for the things I play around with. I am not really striving for a Fields Medal in mathematics.

unshavedyak 2 hours ago | parent [-]

That's fair, neither am i - but i do tend to work in large, complex codebases full of legacy decisions. Eg i have access to Sonnet (of course), but i choose to solely work in Opus because i find its output reads better, analyzes better, etc.

The "cost" of dumb models is just so high for me. Eg every bad decision they make increases my frustration quite a bit. Despite putting a lot of effort into my workflow to help reduce the number of decisions they make, they always will. So my hedge is always against that.. trying to reduce how insane they can be heh.

0xbadcafebee 4 hours ago | parent | prev | next [-]

You should definitely stick to the $200 plan, and not try the $10 coding plans with open weight models and higher limits. Anthropic needs your money to stay solvent, and you'll sleep better knowing you're using SOTA.

port11 39 minutes ago | parent [-]

(Zero reason to defend Anthropic.)

I’ve gone that route. I really wanted to stop using Claude, but DeepSeek v4 Pro and Kimi 2.6 didn’t do the job. For a lot of coding tasks or well-specced plans, maybe… but then that’s a plan made by Opus anyway.

Even Sonnet is sometimes not worth the trouble. Opus is very thorough and reviews its own mistakes quite well. Catches a lot of edge cases.

I’m not saying we shouldn’t try other things (I did!), but it’s more or less okay that people just like Claude Code subscriptions? The back and forth I had with Kimi on a small feature came out to ~1.8€, which is 10% of my Claude subscription each month. And that was a single session. CC with Serena uses tokens fairly well.

gck1 3 hours ago | parent | prev | next [-]

I gave a fairly complex reverse engineering task to DS-4 xhigh and GPT-5.5 xhigh today.

After about 6 hours, both ultimately failed to fully RE it. However, there were some drastic differences:

DS stopped every 30 minutes or so, saying it did full RE and it should all work now, while in fact, it didn't complete even 1% of it. It also looked for shortcuts again and again, despite me prompting heavily that the specific shortcut may not be used. It was a complete and utter failure.

GPT-5.5, on the other hand, blew me away. It just did the right things, didn't jump to next steps until it was sure it completed the initial layers and had a full understanding of what's required. The only time I prompted it during the 6 hours was when I saw it going in the right direction and I could nudge it slightly towards an even better way. I never felt I was fighting it. Okay, maybe a little bit - after compaction, it sometimes would go on a "no I'm not helping you with reverse engineering" tangent, but it would resolve in a clean session.

I cancelled my Claude subscription a month ago, so I haven't tested that, but DeepSeek has reminded me a lot of how I worked with Opus 4.6/4.7. Which perhaps could be a positive sign to some, but GPT-5.5 showed me that the way claude/ds work is just way too annoying.

ttul 2 hours ago | parent | next [-]

What you’re experiencing is the difference in model intelligence. Most models can seem pretty good at simple stuff over short time horizons. Complex work requires that more intelligence be stuffed into those trillion-dimensional spaces.

cmrdporcupine 30 minutes ago | parent | prev [-]

The GPT models are heavily biased to a more incremental, empirical, evidence-based approach. Sometimes to a fault. I prefer them for this reason, but it requires coaxing or strategic use of /goal to break it out of its highly staged, one piece at a time, approach.. if you don't like it.

I suspect for people doing more... website ... type development, the more "yeet this into existence" style of Opus feels preferable.

With Claude I was constantly jamming my finger on the escape key "wait, you did what?! based on what proof?!"

KronisLV 3 hours ago | parent | prev | next [-]

> i've been wanting to go elsewhere.

There's always the option of using Anthropic's models for some tasks like planning and then just hand over the implementation task to something like DeepSeek. Across different tools, a Markdown plan works pretty okay. That's what I'm planning to do if I go from the 5x Max subscription down to the Pro.

I am also writing a launcher that makes using 3rd party providers with Claude Code easy (https://ccode.kronis.dev) and I already have a local proxy up and running, just not dynamic model switching yet. Though it shouldn't be too hard to add, will probably be there within a week or two, depending on my schedule.

I don't think it's wise to leave Anthropic altogether because their models are great (and a subscription gives you features like Remote Control which I like), but switching tiers and maybe saving a bit of money seems viable! On the other hand, you do need a quality baseline, because I remember using Cerebras with GLM 4.6 way back and there was a bit too much slop.

logicchains 4 hours ago | parent | prev [-]

If you want SOTA reasoning you should be using GPT 5.5 Pro.

unshavedyak 3 hours ago | parent | next [-]

This is fair, but i've found the different models to have different moods and require different interactions to get them to stick to just the specific edits i ask for, etc.

I used to surf the three big players frequently and got really tired of the effort needed to steer some models. In the end i ended up sticking with Claude because it required less steering effort. While not strictly reasoning, a model's ability to follow clear directions consistently is something i'd consider part of its SOTA capabilities.

Eventually i just tired of exploring. I just want stability.

Which ironically is why i'm thinking about moving from Claude. The very basic IDE/-p usage getting removed from my plan is a UX stability issue. I'm trying to progressively improve my workflows and efficiency, not have to establish a new foundation anytime something shifts. Quite frustrating.

auggierose 4 hours ago | parent | prev [-]

Codex has only GPT 5.5