HarHarVeryFunny 2 days ago
Yeah... I had a fairly in-depth conversation with Claude a couple of days ago about Claude Code and the way it works, its usage limits, and how other AI coding tools compare, and Claude's extremely blunt advice was that Claude Code is not suitable for serious software development due to usage limits! (Props to Anthropic for not sugar-coating it.) Maybe on the Max 20x plan it becomes viable, and no doubt on the Boris Cherny unlimited-usage plan it does, but it seems that without very aggressive, non-stop context pruning you will rapidly hit limits and the 5-hour timeout even working with a single session, let alone 5 Claude Code sessions and another 5-10 web ones!

The key to this is the way Claude Code (the local part) works and interacts with Claude AI (the actual model, running in the cloud). Basically, Claude Code maintains the context, consisting mostly of the session history, the contents of source files it has accessed, and the read/write/edit tools (based on Node.js) it provides to Claude AI. This entire context, including all files that have been read and the tool definitions, is sent to Claude AI (eating into your token usage limit) with EVERY request, so once Claude Code has accessed a few source files, their content is "silently" sent as part of every subsequent request, regardless of what that request is (the first sketch below illustrates the mechanics).

Claude gave me an example: with 3 smallish files open (a few thousand lines of code), within 5 requests the token usage might be 80,000 or so, vs. the 40,000 limit of the Pro plan or the 200,000 limit of the Max 5x plan. Once you hit the limit you have to wait 5 hours for a usage reset, so without Cherny's infinite usage limit this becomes a game of hurry up and wait (make 5 requests, then wait 5 hours and make 5 more).

You can restrict which source files Claude Code has access to, to try to manage context size (e.g. in a C++ project, let it access all the .h module definition files but block all the .cpp ones), as well as manually inspecting the context to see what is being sent that could be removed (the second sketch below shows the filtering idea). I believe there is also some automatic context compaction happening periodically, but apparently not enough to prevent many/most people from hitting usage timeouts when working on larger projects.

Not relevant here, but Claude also explained how Cursor manages to provide fast/cheap autocomplete using its own models: it builds a vector index of the code base so that only the relevant chunks of code get pulled into the context (third sketch below).
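To make the resend-everything behavior concrete, here's a minimal sketch of that kind of agent loop. Everything in it (buildRequest, TOOL_DEFS, the ~4-chars-per-token heuristic, the file sizes) is a hypothetical illustration I made up, not Claude Code's actual code or Anthropic's API; it just shows why per-request cost grows with every file read:

    // Hypothetical sketch of a Claude Code-style loop: the ENTIRE
    // context (tool defs + every file read so far + full history) is
    // resent on every request. None of these names are Anthropic's.

    type Message = { role: "user" | "assistant"; content: string };

    const TOOL_DEFS = JSON.stringify([
      { name: "read_file" }, { name: "write_file" }, { name: "edit_file" },
    ]);

    const history: Message[] = [];               // whole session transcript
    const openFiles = new Map<string, string>(); // path -> full file text

    // Very rough heuristic: ~4 characters per token.
    const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

    function buildRequest(userTurn: string): string {
      history.push({ role: "user", content: userTurn });
      const fileBlobs = [...openFiles.entries()]
        .map(([path, text]) => `<file path="${path}">\n${text}\n</file>`)
        .join("\n");
      // Tool definitions + all files + all history, every single time.
      return TOOL_DEFS + "\n" + fileBlobs + "\n" + JSON.stringify(history);
    }

    // Simulate 3 smallish files: ~500 lines x ~40 chars = ~5k tokens each.
    for (const path of ["a.h", "b.h", "c.h"]) {
      openFiles.set(path, "x".repeat(500 * 40));
    }

    let cumulative = 0;
    for (let turn = 1; turn <= 5; turn++) {
      const cost = estimateTokens(buildRequest(`request #${turn}`));
      cumulative += cost;
      console.log(`turn ${turn}: ~${cost} tokens sent, ~${cumulative} total`);
    }
    // ~15k tokens of file content per turn means 5 turns cost roughly
    // 75-80k tokens cumulatively -- in line with the example above.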
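The .h-in/.cpp-out restriction amounts to a simple allow/deny filter. This is not Claude Code's real configuration format (it has its own permission settings), just a sketch of the idea:

    // Hypothetical allow/deny filter for what an agent may pull into
    // context -- NOT Claude Code's actual configuration format.
    const ALLOW = [/\.h$/, /\.hpp$/];  // interface files: small, high-signal
    const DENY  = [/\.cpp$/, /\.cc$/]; // implementations: large, often unneeded

    function mayEnterContext(path: string): boolean {
      if (DENY.some((re) => re.test(path))) return false;
      return ALLOW.some((re) => re.test(path));
    }

    console.log(mayEnterContext("src/parser.h"));   // true
    console.log(mayEnterContext("src/parser.cpp")); // false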
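And for contrast, the retrieval approach attributed to Cursor above: chunk the code base, embed each chunk once, and at query time send only the top-k most similar chunks instead of whole files. The toy embed() below is a stand-in for a real embedding model (this is not Cursor's actual implementation); only the shape of the pipeline is the point:

    // Sketch of vector-index retrieval: embed code chunks once, then
    // pull only the most relevant ones into the prompt.

    type Chunk = { file: string; text: string; vec: number[] };
    const index: Chunk[] = [];

    // Toy stand-in embedding (character-trigram hashing); a real
    // system would call an embedding model here.
    function embed(text: string): number[] {
      const vec = new Array(64).fill(0);
      for (let i = 0; i + 2 < text.length; i++) {
        let h = 0;
        for (let j = i; j < i + 3; j++) h = (h * 31 + text.charCodeAt(j)) | 0;
        vec[Math.abs(h) % 64] += 1;
      }
      return vec;
    }

    function cosine(a: number[], b: number[]): number {
      let dot = 0, na = 0, nb = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
      }
      return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
    }

    // Index: split each file into 50-line chunks, embedded once up front.
    function indexFile(file: string, source: string): void {
      const lines = source.split("\n");
      for (let i = 0; i < lines.length; i += 50) {
        const text = lines.slice(i, i + 50).join("\n");
        index.push({ file, text, vec: embed(text) });
      }
    }

    // Query: rank all chunks against the code around the cursor and
    // return only the top-k -- a tiny slice of the repo.
    function relevantContext(query: string, k = 3): string {
      const qv = embed(query);
      return index
        .map((c) => ({ c, score: cosine(qv, c.vec) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, k)
        .map(({ c }) => `// from ${c.file}\n${c.text}`)
        .join("\n\n");
    }

    indexFile("math.ts", "export const add = (a: number, b: number) => a + b;");
    console.log(relevantContext("const add ="));

The upshot is that per-request cost stays roughly constant with repo size, instead of growing with every file the agent has touched.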