SkyPuncher 7 hours ago
I skimmed the issue. No wonder Anthropic closes these tickets out without much action. That's just a wall of AI garbage. Here's what I've done to mostly fix my usage issues:

* Turn on max thinking in every session. It saves tokens overall because I'm not correcting it or having it waste energy on dead paths.
* Keep active sessions active. Caches seem to expire after ~5 minutes (especially during peak usage). When the caches expire, it seems like all the tokens need to be rebuilt, and this gets especially bad as token usage goes up.
* Compact after 200k tokens as soon as I reasonably can. I have no data, but my usage absolutely skyrockets as I get into longer sessions. This is the most frustrating part because Anthropic forced the 1M model on everyone.
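The cache-expiry point can be put in rough numbers. A sketch, assuming Anthropic's published prompt-cache multipliers (cache writes billed at 1.25x base input, cache reads at 0.1x); the base price here is illustrative, and `turn_cost` is a made-up helper, not anything in Claude Code:

```python
# Back-of-the-envelope input cost of one turn that resubmits the whole
# context, with a warm vs. expired prompt cache. Multipliers follow
# Anthropic's documented cache pricing; the base rate is illustrative.
BASE = 3.00 / 1_000_000  # $/input token (Sonnet-class, illustrative)

def turn_cost(context_tokens: int, cache_warm: bool) -> float:
    """Input-side cost of a single turn at the given context size."""
    if cache_warm:
        return context_tokens * BASE * 0.10   # cache read
    return context_tokens * BASE * 1.25       # cache expired: full rebuild

warm = turn_cost(200_000, cache_warm=True)
cold = turn_cost(200_000, cache_warm=False)
print(f"warm: ${warm:.2f}  cold: ${cold:.2f}  ratio: {cold / warm:.1f}x")
```

Under those assumptions a single cache miss at 200k context costs on the order of 12x a warm turn, which would explain why idling past the expiry window hurts so much.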
losvedir 7 hours ago
Haha, yeah, my eyes glazed over immediately on the issue. This was absolutely someone telling their Claude Code to investigate why they ran out of tokens and open the issue. Good chance it's not real or misdiagnosed. But it gives me some degree of schadenfreude to see it happening to the Claude Code repo.
Chaosvex 7 hours ago
I love how some comments tell you to turn max thinking on and others tell you to turn thinking off entirely. Apparently, they both save tokens! Vibes, indeed.
himata4113 7 hours ago
The problem is actually that their cache invalidates randomly, which is why replaying inputs at 200k+ tokens sucks up all your usage. This is a bug within their systems that they refuse to acknowledge. My guess is that API clients evict subscription users' caches early, which would explain this behavior; if so, then it's a feature, not a bug. They also silently raised how much usage input tokens consume, so it's a double whammy.
stldev 7 hours ago
Can confirm. Max effort helps; keeping context to <= ~20-25% of the window is crucial these days.

> Keep active sessions active. It seems like caches are expiring after ~5 minutes (especially during peak usage). When the caches expire it seems like all tokens need to be rebuilt; this gets especially bad as token usage goes up.

Is this as opaque on their end as it sounds, or is there a way to check?
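There is at least partial visibility at the API level: the Messages API `usage` object reports `cache_creation_input_tokens` (tokens written to cache) and `cache_read_input_tokens` (tokens served from cache), so a cold cache shows up as a large creation count with zero reads. Whether Claude Code surfaces these is another matter. A sketch with hypothetical payloads (the `cache_hit_ratio` helper is made up for illustration):

```python
# Estimate what fraction of a request's input came from the prompt cache,
# given the usage block a Messages API response returns.
def cache_hit_ratio(usage: dict) -> float:
    """Fraction of input tokens served from the prompt cache."""
    read = usage.get("cache_read_input_tokens", 0)
    wrote = usage.get("cache_creation_input_tokens", 0)
    fresh = usage.get("input_tokens", 0)
    total = read + wrote + fresh
    return read / total if total else 0.0

# Hypothetical payloads: same 180k context, warm vs. expired cache.
warm = {"input_tokens": 1_200, "cache_read_input_tokens": 180_000,
        "cache_creation_input_tokens": 0}
cold = {"input_tokens": 1_200, "cache_read_input_tokens": 0,
        "cache_creation_input_tokens": 180_000}
print(round(cache_hit_ratio(warm), 3), cache_hit_ratio(cold))
```

If the hit ratio drops to zero mid-session after a pause, that would be direct evidence of the ~5-minute expiry people are describing.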
coderbants 7 hours ago
Can't you turn the 1M off with /model opus (or /model sonnet)? At least up until recently, the 1M model was separated out as /model opus[1M].
danmaz74 5 hours ago
> This is the most frustrating thing because Anthropic forced the 1M model on everyone.

This is spot on. It would be great (and very easy for them) to have a setting where you can force compaction at a much lower value, e.g. 300k tokens.
ayhanfuat 7 hours ago
> Turn on max thinking in every session. It saves tokens overall because I'm not correcting it or having it waste energy on dead paths.

This is definitely true. Ever since I realized there is an /effort max option, I am no longer fighting it that much and wasting hours.
hartator 7 hours ago
Everything starts to feel like AI slop these days. Including this comment.