| ▲ | HarHarVeryFunny 6 hours ago |
It seems a lot of the problem isn't "token shrinkage" (reduced plan limits) but rather changes to prompt caching: things that used to be cached for 1 hour are now cached for only 5 minutes. Coding agents rely on prompt caching to avoid burning through tokens. They go to lengths to keep context/prompt prefixes constant — placing unchanging content like tool definitions and file contents first, and variable content like new instructions after it — so that prompt caching actually gets used.

The change to a new tokenizer that generates up to 35% more tokens for the same text input is wild, too. That's going to really increase token usage for large text inputs like code.
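To illustrate the stable-prefix point: prompt caching is done server-side by the provider, and a cache hit only covers the longest prefix shared with a previous request. A toy sketch (hypothetical helper names, not any real agent's code) shows why ordering matters:

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared prefix between two prompts ~ the cacheable portion."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

TOOLS = "tool definitions..."   # rarely changes between turns
FILES = "file contents..."      # rarely changes between turns

def build_prompt(instruction: str, stable_first: bool = True) -> str:
    if stable_first:
        # stable prefix shared across turns -> cache can cover TOOLS + FILES
        return TOOLS + FILES + instruction
    # variable content first -> every new instruction invalidates the prefix
    return instruction + TOOLS + FILES

turn1 = build_prompt("fix the bug in foo()")
turn2 = build_prompt("now add a test")
good = common_prefix_len(turn1, turn2)   # entire stable prefix is shared

bad1 = build_prompt("fix the bug in foo()", stable_first=False)
bad2 = build_prompt("now add a test", stable_first=False)
bad = common_prefix_len(bad1, bad2)      # prompts diverge at the first character
```

With the stable-first ordering, every turn re-sends the same long prefix and only the tail changes; with variable content first, the shared prefix is essentially zero and nothing can be served from cache.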
| ▲ | mnicky an hour ago | parent |
> things that used to be cached for 1 hour now only being cached for 5 min.

Doesn't this only apply to subagents, which don't have much long-lived context anyway?