sillysaurusx 12 hours ago
> the file loads into context on every message, so on low-output exchanges it is a net token increase

Isn’t this what Claude’s personalization setting is for? It’s globally on. I like conciseness, but it should be because it makes the writing better, not because it saves a few tokens. I’d sacrifice the extra tokens for outputs that were 20% better, and conciseness correlates with quality.

See also this Reddit comment for other things that supposedly help: https://www.reddit.com/r/vibecoding/s/UiOywQMOue

> Two things that helped me stay under [the token limit] even with heavy usage:
>
> Headroom - open source proxy that compresses context between you and Claude by ~34%. Sits at localhost, zero config once running. https://github.com/chopratejas/headroom
>
> RTK - Rust CLI proxy that compresses shell output (git, npm, build logs) by 60-90% before it hits the context window. Stacks on top of Headroom. https://github.com/rtk-ai/rtk
>
> MemStack - gives Claude Code persistent memory and project context so it doesn't waste tokens re-reading your entire codebase every prompt. That's the biggest token drain most people don't realize. https://github.com/cwinvestments/memstack
>
> All three stack together. Headroom compresses the API traffic, RTK compresses CLI output, MemStack prevents unnecessary file reads.

I haven’t tested those yet, but they seem related and interesting.
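For anyone curious what "compressing shell output before it hits the context window" can mean in practice, here is a toy sketch. This is not Headroom's or RTK's actual algorithm (I haven't read their source); it just illustrates the general shape — collapsing whitespace runs and dropping consecutive duplicate lines, which build logs and npm output are full of. The function name and example log are made up for illustration.

```python
import re

def compress_output(text: str) -> str:
    """Toy context compressor: collapse whitespace runs and drop
    consecutive duplicate lines. Purely illustrative -- not the
    technique any of the tools above actually use."""
    kept = []
    prev = None
    for line in text.splitlines():
        line = re.sub(r"[ \t]+", " ", line).rstrip()
        if line != prev:  # skip consecutive duplicates
            kept.append(line)
        prev = line
    return "\n".join(kept)

# Hypothetical noisy build log
log = "npm WARN   deprecated foo@1.0\nnpm WARN   deprecated foo@1.0\nbuild    ok"
print(compress_output(log))
```

The real tools presumably do far more (semantic dedup, summarization), but even naive filtering like this shows why piping raw tool output straight into a model's context is wasteful.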
IxInfra 12 hours ago | parent
[dead]