| ▲ | rectang 8 hours ago |
| I feel like I'm using Claude Opus pretty effectively and I'm honestly not running up against limits in my mid-tier subscriptions. My workflow is more "copilot" than "autopilot", in that I craft prompts for contained tasks and review nearly everything, so it's pretty light compared to people doing vibe coding. The market-leading technology is pretty close to "good enough" for how I'm using it. I look forward to the day when LLM-assisted coding is commoditized. I could really go for an open source model based on properly licensed code. |
|
| ▲ | Retr0id 8 hours ago | parent | next [-] |
| I also use it this way and I'm overall pretty happy with it, but it feels like they really want us to use it in "autopilot" mode. It's like they have two conflicting priorities of "make people use more tokens so we can bill them more" and "people are using more tokens than expected, our pricing structure is no longer sustainable" (but I guess they're not really conflicting, if the "solution" involves upgrading to a higher plan) |
| |
| ▲ | fluidcruft 8 hours ago | parent | next [-] | | I feel like they are making it harder to use it this way. Encouraging autonomous use is one thing, but it really feels more like they are handicapping engaged use. I suspect it reflects their own development practices and needs. | | |
| ▲ | freedomben 8 hours ago | parent | next [-] | | This is something I've thought of as well. The way the caps are implemented really disincentivizes engaged use. The 5-hour window especially is very awkward and disruptive. The net result is that I have to somewhat plan my day around when the 5-hour window will kick in. That by itself is a powerful disincentive from using Claude. It has also pushed me to different tools for things I previously would have used Claude for. For example, for detailed plans I now use Codex rather than Claude, because I hit the limit far too fast when doing documentation work. It certainly doesn't hurt that Codex seems to be better at it, but I wouldn't even have a Codex subscription if it weren't for Claude's usage limits. | | |
| ▲ | j3g6t 2 hours ago | parent | next [-] | | Wow, weird to see someone mirror my experience so closely. At the $100 plan my day was being warped around how to maximise multiple 5-hour sessions so that it felt worth it. Dropped down to the $20 plan and stopped playing the game, as I know I'll just consume the weekly usage in the few days I have free. Meanwhile Codex gave me a free month; their 5HourUsageWindow:WeeklyUsageWindow ratio feels way better balanced and I get way more work done from it. Similar to you, any task involving reading/reviewing docs [or code reviews] now insta-nukes Claude's usage. My record is 12 minutes so far... | |
| ▲ | Retr0id 8 hours ago | parent | prev [-] | | Another big one for me is that they dropped the cache TTLs. It is normal for me to come back to a session an hour later, but someone "autopilot"-ing won't have such gaps. | | |
| ▲ | p_stuart82 7 hours ago | parent [-] | | not just the cache though. every time you stop and come back, it basically reloads the whole session. if you just let it keep going, it counts like one smooth run. you hit the wall faster for actually checking its work. | | |
| ▲ | fluidcruft 3 hours ago | parent [-] | | It was probably the bug about cache getting purged after 5min rather than 1hour. You can review things pretty well within an hour. 5min is a real crunch. 5min doesn't mix with multitasking or getting interrupted. |
|
|
| |
| ▲ | 8 hours ago | parent | prev [-] | | [deleted] |
| |
| ▲ | dandaka 7 hours ago | parent | prev | next [-] | | autopilot (yolo mode) is amazing and feels great, truly delegate instead of hand-holding on every step | | |
| ▲ | dutchCourage 7 hours ago | parent | next [-] | | Do you have any good resources on how to work like that? I made the move from "auto complete on steroids" to "agents write most of my code". But I can't imagine running agents unchecked (and in parallel!) for any significant amount of time. | | |
| ▲ | sroerick 6 hours ago | parent | next [-] | | Right now, I'm finding a decent rhythm in running 10-20 prompts and then checking the results a few different ways. I'll ask the agent to review the code, I'll go through myself, I'll do some usability and gut checks. This seems to be a good window where I can implement a pretty large feature, and then go through and address structural issues. Goofy things like the agent adding an extra database, weird fallback logic where it ends up building multiple systems in parallel, etc. Currently, I find multiple agents in parallel on the same project to be not super functional. There are just a lot of weird things: agents get confused about worktrees, git conflicts abound, and I found the administrative overhead to be too heavy. I think plenty of people are working on streamlining the orchestration issue. In the meantime, I combat the ADD by working on a few projects in parallel. This seems to work pretty well for now. It's still cat herding, but the thing is that refactors are now pretty quick. You just have to have awareness of them. I was thinking it'd be cool to have an IDE that did coloring of, say, the last 10 git commits to a project so you could see what has changed. I think robust static analysis and code-as-data tools built into an IDE would be powerful as well. The agents basically see your codebase fresh every time you prompt. And with code changes happening much more regularly, I think devs have to build tools with the same perspective. | |
| ▲ | mescalito 7 hours ago | parent | prev | next [-] | | I would also be interested on resources on "agents write most of your code" if you can share some. | |
| ▲ | nurettin 6 hours ago | parent | prev [-] | | Same here, especially when I keep catching things like DRY violations and a lack of overall architecture. Everything feels tacked on. To give them the benefit of the doubt, perhaps these people provide such a detailed spec that they basically write code in natural language. |
| |
| ▲ | 8ytecoder 5 hours ago | parent | prev [-] | | I use Claude “on the web” or Google Jules. Essentially everything happens in a sandbox - so yolo isn’t a huge risk. You can even box its network access. You review the PR at the end or steer it if it’s veering off course. |
| |
| ▲ | naravara 8 hours ago | parent | prev [-] | | I think the culty element of AI development is really blinding a lot of these companies to what their tools are actually useful for. They’re genuinely great productivity enhancers, but the boosters are constantly going on about how it’s going to replace all your employees and it’s just. . .not good for that! And I don’t mean “not yet” I mean I don’t see it ever getting there barring some major breakthrough on the order of inventing a room-temp superconductor. | | |
| ▲ | dasil003 7 hours ago | parent [-] | | I agree with you, the "replacing people" narrative is not only wrong, it's inflammatory and brand suicide for these AI companies who don't seem to realize (or just don't care) the kind of buzz saw of public opinion they're walking straight towards. That said, looking at the way things work in big companies, AI has definitely made it so one senior engineer with decent opinions can outperform a mediocre PM plus four engineers who just do what they're told. |
|
|
|
| ▲ | raincole 7 hours ago | parent | prev | next [-] |
> the day when LLM-assisted coding is commoditized
Like yesterday? LLM-assisted coding is $100/mo. It looks very commoditized when most houses in the developed world pay more than that for electricity. My definition of LLM-assisted coding is that you fully understand every change and every single line of the code. Otherwise it's vibe coding. And I believe that if one is honest about this principle, it's very hard to deplete the quota of the $100 tier.
| |
| ▲ | windexh8er 6 hours ago | parent | next [-] | | > Like yesterday? LLM-assisted coding is $100/mo. It looks very commoditized when most houses in developed world pay more for electricity than that. But it's not $100/mo. I think the best showcase of where AI is at is on the generative video side. Look at players like Higgsfield. Check out their pricing and then go look at Reddit for actual experiences. With video generation the results are very easy to see. With code generation the results are less clear for many users, especially when things "just work". Again, it's not $100/month for Anthropic to serve most users. These costs are still being subsidized, and as more expensive plans roll out with access to "better" models and "more" tokens and context, the true cost per user is slowly starting to be exposed. I routinely hit limits with Anthropic that I hadn't been hitting for the same (and even less) utilization. I dumped the Max account recently because the value wasn't there anymore. I am convinced that Opus 3 was Anthropic's pinnacle at this point, and while the SotA models of today are good, they're tuned to push people towards paying for overages at a significantly faster consumption rate than a right-sized plan for their usage. The reality is that nobody can afford to continue to offer these models at the current price points and be profitable at any time in the near future. And it's becoming more and more clear that Google is in a great position to let Anthropic and OAI duke it out with other people's money, while they have the cash, infrastructure, and reach to play the waiting game of keeping up without having to worry about all of the constraints their competitors do. But I'd argue that nothing has been commoditized, since we have no clue what LLMs cost at scale, and it seems that nobody wants to talk about that publicly. | | |
| ▲ | KaiserPro 6 hours ago | parent [-] | | > I think the best showcase of where AI is at is on the generative video side. Look at players like Higgsfield. Check out their pricing and then go look at Reddit for actual experiences. With video generation the results are very easy to see Video is a different ballgame entirely: it's less than realtime on _large_ GPUs. Moreover, because of the inter-frame consistency, it's really hard to transfer and keep context. Running inference on text is, or can be, very profitable. It's research and dev that's expensive. | |
| ▲ | windexh8er 5 hours ago | parent [-] | | My point wasn't the delta in work between video and text generation. It was that the degradation of a prompt is much more visible (because: literal). But, generally agree on the research/dev part. |
|
| |
| ▲ | sidrag22 7 hours ago | parent | prev | next [-] | | > fully understand every change and every single line of the code.
I'm probably just not being charitable enough to what you mean, but that's an absurd bar that almost nobody conforms to even when code is fully handwritten. Nothing would get done if they did. But again, my emphasis is on the fact that I'm probably just not being charitable to what you mean. | |
| ▲ | Maxatar 6 hours ago | parent | next [-] | | You're most likely being pedantic, like when someone says they understand every single line of this code:

    x = 0
    for i in range(1, 10):
        x += i
    print(x)

They don't mean they understand the silicon substrate of the microprocessor executing microcode or the CMOS sense amplifiers reading the SRAM cells caching the loop variable. They just mean they can more or less follow along with what the code is doing. You don't need to be very charitable in order to understand what he genuinely meant, and understanding the code one writes is how many (but not all) professional software developers who didn't just copy and paste stuff from Stack Overflow used to carry out their work. | |
| ▲ | sidrag22 6 hours ago | parent | next [-] | | You drew it to its most uncharitable conclusion for sure, but yeah, that's pretty much the point I was making. How deep do I need to understand range() or print() to utilize either, on the slightly less extreme end of the spectrum? But yeah, I'm pretty sure it's a point that maybe I could have kept to myself and been charitable instead. | |
| ▲ | _puk 5 hours ago | parent | prev [-] | | "Understand your code" in this day and age likely means hitting the point of deterministic evaluation. print(x) is a great example.
That's going to print x. Every time. Agent.print(x) is pretty likely to print x every time. But hey, who knows, maybe it's having an off day. |
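The determinism point above can be sketched in a few lines (`render` is a hypothetical stand-in for ordinary, non-LLM code):

```python
# Deterministic evaluation: same input, same output, every single run.
def render(x):
    return f"value: {x}"

print(render(42))  # prints "value: 42", every time

# An LLM-mediated Agent.print(x) gives no such guarantee: sampling,
# temperature, and model updates can all change what comes back.
```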
| |
| ▲ | thomasmg 7 hours ago | parent | prev | next [-] | | Well, that is how it mostly worked until recently... unless the developer copied and pasted from Stack Overflow without understanding much. Which did happen. |
| ▲ | satvikpendem 6 hours ago | parent | prev | next [-] | | How is that an absurd bar? If you're handwriting code, you need to know what you actually want to write in the first place, hence you understand all the code you write. The code the AI produces should therefore also be understood by you. Anything less than that is indeed vibe coding. | |
| ▲ | Maxatar 6 hours ago | parent | next [-] | | A lot of developers don't actually understand the code they write. Sure, nowadays a lot of code is generated by LLMs, but in the past people just copied and pasted stuff off of blogs, Stack Overflow, or whatever other resources they could find without really understanding what it did or how it worked. Jeff Atwood, along with numerous others (who Atwood cites on his blog [1]), were not exaggerating when they observed that the majority of candidates who had existing professional experience, and even MSc. degrees, were unable to code very simple solutions to trivial problems. [1] https://blog.codinghorror.com/why-cant-programmers-program/ | |
| ▲ | sidrag22 6 hours ago | parent | prev [-] | | It's an absurd bar if you are being an uncharitable jerk like I was; the layers go deep, and technically I can claim I have never fully grasped any of my code. It is likely just a dumb point to bring up, tbh. | |
| ▲ | satvikpendem 2 hours ago | parent [-] | | I saw your reply to another comment [0], I see what you mean now. By "understand each line of code" I meant that one would know how that for loop works not the underlying levels of the implementation of the language. I replied initially because lots of vibe coding devs in fact do not read all the code before submitting, much less actually review it line by line and understand each line. [0] https://news.ycombinator.com/item?id=47894279 |
|
| |
| ▲ | hunterpayne 27 minutes ago | parent | prev | next [-] | | I do. If you don't, maybe you shouldn't be writing software professionally. And yes, I've written both DBs and compilers so I do understand what is happening down to the CMOS. I think what you are doing is just cope. | |
| ▲ | andrewjvb 7 hours ago | parent | prev | next [-] | | It's a good point. To me this really comes down to the economics of the software being written. If it's low-stakes, then the required depth to accept the code is also low. | |
| ▲ | sbarre 6 hours ago | parent | prev | next [-] | | Could they have meant "every line of code being committed by the LLM" within the current scope of work? That's how I read it, and I would agree with that. | |
| ▲ | raincole 6 hours ago | parent | prev | next [-] | | I mean "understanding it just like when you hand wrote the code in 2019." Obviously I don't mean "understanding it so you can draw the exact memory layout on the white board from memory." | |
| ▲ | torben-friis 6 hours ago | parent | prev [-] | | You don't understand every change you make in the PRs you offer for review? |
| |
| ▲ | fsckboy 5 hours ago | parent | prev | next [-] | | >LLM-assisted coding is $100/mo. It looks very commoditized when most houses in developed world pay more for electricity than that. this is a small nit, but you still have to pay your electric bill, the $100/mo is on top of that. if you're doing cost accounting you don't want to neglect any costs. Just because you can afford to lease a car, doesn't mean you can afford to lease a 2nd car. | |
| ▲ | rectang 6 hours ago | parent | prev | next [-] | | Commoditization will be complete for my purposes when an LLM trained on a legitimately licensed corpus can achieve roughly what Opus 4.5+ or the highest powered GPTs can today. I anticipate a Napster-style reckoning at some point when there's a successful high-profile copyright suit around obviously derivative output. It will probably happen in video or imagery first. | |
| ▲ | BowBun 6 hours ago | parent | prev [-] | | In industry, the cost is more than $100/mo for engineers. With increased adoption and what I know now, I expect full-time devs to rack up $500-$2000 usage bills if they're going full parallel agentic dev. Personal usage for projects and non-production software is not a benchmark, IMO. | |
| ▲ | mchusma 5 hours ago | parent | next [-] | | I work with a lot of full-time devs, and it is very hard to go beyond the $200 max plan. If you use API credits, and I think the enterprise plan kind of forces you to do this, you can definitely incur this much, particularly if you're not using prompt caching and things like that. But I and others in my company have very heavy usage. We only rarely, with parallel agentic processes, run out of the $200 a month plan. And what do I mean by "hard"? I mean, it requires a lot of active thinking to think about how you can actively max it out. I'm sure there's some use cases where maybe it is not hard to do this, but in general, I find most devs can't even max out the $100 a month plan, because they haven't quite figured out how to leverage it to that degree yet. (Again, if someone is using the API instead of subscription, I wouldn't be surprised to see $2,000 bills.) | | |
| ▲ | ebiester 5 hours ago | parent [-] | | Business/Enterprise accounts are billed at $20/seat + API prices, not subscription prices. You can give them a monthly dollar quota or let them go unlimited, but they're not being subsidized like in team. And team can't get a 20x plan from what I can tell. |
| |
| ▲ | adastra22 6 hours ago | parent | prev [-] | | I routinely use $4k to $5k worth of tokens a month on my $200/mo Max subscription. I don't even code every day. You can use a Max subscription for work, btw. | | |
|
|
|
| ▲ | goalieca 7 hours ago | parent | prev | next [-] |
Similar here with the copilot rather than autopilot usage. I find it's the best of them all. Mostly I just use it as an occasional search engine. I've never found LLMs to be efficient at actually doing work. I do miss the days when tech docs were usable. Claude seems like a crutch for gaps in developer experience more than anything.
|
| ▲ | llm_nerd 8 hours ago | parent | prev | next [-] |
I have Max 5x and use only Claude Opus on xhigh mode. I don't use agents, or even MCPs, and stick to Claude Code. I find it incredibly difficult to saturate my usage. I'm ending the average week at 30-ish percent, despite this thing doing an enormous amount of work for (with?) me. Now I will say that with Pro I was constantly hitting the limit -- like comically so, and single requests would push me over 100% for the session and into paying for extra usage -- and Max 5x feels like far more than 5x the usage, but who knows. Anthropic is extremely squirrely about things like surge rates, and so on. I'm super skeptical of the influx of "DAE think Opus sucks now. Let's all move to Codex!" nonsense that has flooded HN. A part of it is the ex-girlfriend thing where people are angry about something and try to force-multiply their disagreement, but some of it legitimately smells like astroturfing. Like, OpenAI got done paying $100M for some unknown podcaster and started hiring people to write this stuff online.
| |
| ▲ | pixelpoet 7 hours ago | parent | next [-] | | I was in the same boat until the last few days, when just a handful of queries were enough to saturate my 5h session in about 30 minutes. Recently I've gotten Qwen 3.6 27b working locally and it's pretty great, but still doesn't match Opus; I've got to check out that new DeepSeek model sometime. | |
| ▲ | NewsaHackO 7 hours ago | parent | prev | next [-] | | Yeah, I never got how people are even able to hit the weekly limits so consistently. Maybe it's because they use it for work? But in that case you would expect the employer to cover it, so idk. >I'm super skeptical of the influx of "DAE think Opus sucks now. Let's all move to Codex!" nonsense that has flooded HN. A part of it is the ex-girlfriend thing where people are angry about something and try to force-multiply their disagreement, but some of it legitimately smells like astroturfing. Like OpenAI got done paying $100M for some unknown podcaster and started hiring people to write this stuff online. A lot of people are angry about the whole openclaw situation. They are especially bitter that when they attempted to justify exfiltrating the OAuth token to use for openclaw, nobody agreed with them that they had the right to do so, and sided with Anthropic that different limits for first-party use are standard. So they create threads like this and complain about some opaque reason why Anthropic is finished (while still keeping their subscription, of course). | |
| ▲ | RealStupidity 7 hours ago | parent | prev [-] | | If only OpenAI spent a significant amount of money on some kind of generative software that was predominantly trained on internet comments that'd be able to do all the astroturfing for them... | | |
| ▲ | llm_nerd 6 hours ago | parent | next [-] | | A bunch of green accounts would be a bit of a tell. They need to use established accounts, ideally pre-llm, for astroturfing. This is going to be increasingly true. | |
| ▲ | dwedge 7 hours ago | parent | prev [-] | | This kind of "if only" sarcastic comment belongs on reddit from 5 years ago |
|
|
|
| ▲ | dboreham 7 hours ago | parent | prev | next [-] |
| Same. Never hit a limit. Use it heavily for real work. Never even thought of firing off an LLM for hours of...something. Seems like a recipe for wasting my time figuring out what it did and why. |
|
| ▲ | taytus 8 hours ago | parent | prev | next [-] |
| I'd recommend Kimi k2.6 for your use. It is an excellent model at a fraction of the cost, and you can use Claude Code with it. I did a 1:1 map of all my Claude Code skills, and it feels like I never left Opus. Super happy with the results. |
| |
| ▲ | wolttam 8 hours ago | parent | next [-] | | I was saying the same until DeepSeek v4 this morning... sorry, Kimi. The competition is intense! | | |
| ▲ | Aldipower 7 hours ago | parent [-] | | Fascinating, but a bummer that DeepSeek does not offer a DPA or a training opt-out. This renders it unusable for my use cases, unfortunately. At least z.ai's GLM has a DPA of sorts in Singapore. | |
| ▲ | wolttam 6 hours ago | parent [-] | | The weights are open and you can use the model with any third party provider that gives you the DPA you want. For my use-case, I want the providers to get my tokens as long as they plan to keep releasing open-weight models |
|
| |
| ▲ | folmar 6 hours ago | parent | prev | next [-] | | If you don't use a lot of quota, the cheapest monthly Claude Code plan is $20 and Kimi Code is $19, i.e. the cost difference is minuscule. Kimi wants my phone number on signup, so it's a no-go for me. | |
| ▲ | ramoz 8 hours ago | parent | prev | next [-] | | What provider do you use for Kimi? | |
| ▲ | skippyboxedhero 6 hours ago | parent | next [-] | | The provider is a massive issue. People moving off Claude tend to assume this is solved. Claude's uptime is terrible. The uptime of most other providers is even worse...and you get all the quantization, don't know what model you are actually getting, etc. | | |
| ▲ | Leynos 3 hours ago | parent [-] | | Kimi 2.5 was like using Sonnet 4 on a flaky ADSL line. I haven't tried K2.6 yet, but the physical unreliability of the connection was too off-putting. |
| |
| ▲ | bigethan 5 hours ago | parent | prev | next [-] | | OpenRouter and I'm toying around with Hermes. Seems good so far, but haven't really gotten into anything heavy yet. Though the "freedom" of not sweating the token pause and the costs not being too high is real. | |
| ▲ | taytus 7 hours ago | parent | prev [-] | | Straight from them. I know other providers like io.net can be faster, but I like to directly support the project. | |
| ▲ | subscribed 4 hours ago | parent [-] | | Thx. I'll try it with my personal projects (because due to the data collection and ToS most providers are forbidden at my company), if I can opt out of training on my input. I'm just getting a bit tired of Opus 2.6 eating my whole allowance, and then some £££ on top, going through a 4 kB prompt to review a ~13 kB text file twice - and that's on top of the sometimes utterly bonkers, bad, lazy answers I don't even get from the local Gemma 4 E4B. |
|
| |
| ▲ | spaceman_2020 4 hours ago | parent | prev [-] | | did you just copy-paste or is there a difference in the way kimi uses skills? | | |
| ▲ | taytus 3 hours ago | parent [-] | | I don’t have the prompt at hand but basically I told Kimi (paraphrasing): I have these Claude code skills, and I know it uses different tool calls than you but read them and re-write them as your own tools. I also created a mini framework so it can test that the skills are actually working after implementation. Everything runs perfectly. |
|
|
|
| ▲ | cyanydeez 8 hours ago | parent | prev | next [-] |
Honestly, it sounds like, assuming you have no ethical qualms, you could get by with a Mac or an AMD 395+ and the newest models, specifically Qwen3.5-Coder-Next. It does exactly what you describe. It maxes out around 85k context, which, if you do a good job providing guardrails, is the length of a small-to-medium project. It does seem like the sweet spot between WALL-E and the destroyed Earth in WALL-E.
| |
| ▲ | ethicalqualms 8 hours ago | parent | next [-] | | Sorry, out of the loop. Which ethical qualms are you referring to? | | |
| ▲ | kbelder 8 hours ago | parent | next [-] | | Using a Mac, obviously. | |
| ▲ | rectang 6 hours ago | parent | prev | next [-] | | I have ethical qualms to varying degrees with most LLMs, primarily because of copyright laundering. I'm a BSD-style Open Source advocate who has published a lot of Apache-licensed code. I have never accepted that AI companies can just come in and train their models on that code without preserving my license, just allowing their users to claim copyright on generated output and take it proprietary or do whatever. I would actually not mind licensing my work in an LLM-friendly way, contributing towards a public pool from which generated output would remain in that pool. Perhaps there is opportunity for Open Source organizations to evolve licenses to facilitate such usage. For what it's worth, I would be happy to pay for a commercial LLM trained on public domain or other properly licensed works whose output is legitimately public domain. | |
| ▲ | folkrav 8 hours ago | parent | prev [-] | | My guess - China. |
| |
| ▲ | hadlock 2 hours ago | parent | prev [-] | | Seems like the AMD 395+ only does about 16 tokens/s, which is 25-33% the speed of SOTA models. Break-even on a $3000 machine is ~15 months. | |
| ▲ | cyanydeez 2 hours ago | parent [-] | | That's pessimistic. Do the calc assuming cloud provider X changes your nondeterministic output every Y months with probability Z, and increases prices by 10% every 6 months. Slow and steady is worth exponentials. Keep slopping it, my boid. |
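The ~15-month break-even figure above can be sketched like this (the $3000 machine and $200/mo plan are illustrative assumptions from the thread, not vendor pricing):

```python
# Break-even: one-time local hardware cost vs. a recurring cloud plan.
machine_cost = 3000   # assumed price of a local AMD 395+ box, USD
monthly_plan = 200    # assumed cloud subscription it would replace, USD/month

months = machine_cost / monthly_plan
print(f"break-even after {months:.0f} months")  # prints "break-even after 15 months"
```

Price escalation or a cheaper plan shifts this: against a $100/mo plan the same machine takes ~30 months to pay off.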
|
|
|
| ▲ | djyde 4 hours ago | parent | prev | next [-] |
| [dead] |
|
| ▲ | boxingdog 8 hours ago | parent | prev [-] |
| [dead] |