Yes, I've experienced everything you stated. Here's what helped me:

Problem 1: "Obsessed with reinventing the wheel", "three duplicate functions":

Suggestion: plan then implement.

Tell the LLM to scan your project and create a markdown plan file for the task first. DO NOT try to solve tasks in a single shot without planning. Review the plan file. Then, IN A NEW SESSION with clean context, tell the LLM to read the plan file and implement the plan as written.
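A minimal sketch of the two prompts, one per session. The file name `PLAN.md`, the function names, and the exact wording are all made up for illustration; adapt them to whatever agent you use.

```python
from pathlib import Path

# Hypothetical plan file name; any path in the repo works.
PLAN_FILE = Path("PLAN.md")


def planning_prompt(task: str, plan_file: Path = PLAN_FILE) -> str:
    """Session 1: ask for a written plan only, no code changes."""
    return (
        f"Scan this project and write an implementation plan to {plan_file}.\n"
        f"Task: {task}\n"
        "List the files to change and any existing functions to reuse.\n"
        "Do NOT implement anything yet."
    )


def implementation_prompt(plan_file: Path = PLAN_FILE) -> str:
    """Session 2 (fresh context): implement from the reviewed plan file."""
    return (
        f"Read {plan_file} and implement the plan exactly as written.\n"
        "If the plan is unclear or wrong, stop and ask instead of improvising."
    )
```

Paste the first prompt into one session, review and edit the plan file yourself, then paste the second prompt into a brand new session. Asking the plan to list reusable functions is what targets the duplicate-function problem.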

---

Problem 2: "hyper-focuses on the current task and couldn't care less if its changes break other parts of the system"

Suggestion: add instructions to your AGENTS.md file that teach the LLM how to run unit tests and other kinds of tests, so it can make sure nothing broke. Also state in AGENTS.md that the LLM MUST run the tests before marking a task done.
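One way to make that instruction easy to follow is a single check script that AGENTS.md can point at. A minimal sketch, assuming a Python project tested with pytest; the `CHECKS` list and the `run_checks` name are hypothetical, so swap in your own commands.

```python
import subprocess
import sys

# Hypothetical check commands; replace with what your project actually runs.
CHECKS = [
    [sys.executable, "-m", "pytest", "-q"],
]


def run_checks(checks=None) -> bool:
    """Run each command in order; return True only if all exit with code 0."""
    for cmd in (CHECKS if checks is None else checks):
        print("running:", " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Stop at the first failure so the agent sees the relevant output.
            print("FAILED:", " ".join(cmd))
            return False
    return True
```

Wrap it in a small script that exits non-zero when `run_checks()` returns False, and have AGENTS.md say the task is not done until that script passes.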

---

Problem 3: "you'll hit the 200k token limit in no time" "Long context = instant brain damage"

Suggestion: use LLMs with a 1 million token context window. Also, plan-then-implement will keep your context shorter.

If you can, use better LLM services that offer a 1 million token context window. If you can't afford Anthropic or OpenAI, use DeepSeek V4 Flash or MiMo 2.5, for example. A $10/mo OpenCode Go subscription plan offers $60 in LLM credits, which is A LOT for these cheap LLMs.

Also, the planning phase is when the LLM has to scan the entire project to understand what needs changing. This is where the context bloat comes from. If you split tasks into planning + implementation, the results of the scanning phase are condensed into a single markdown file, which keeps the implementation session's context lean.

Bonus tip: Tell LLMs to use subagents when doing exploration.

---

Problem 4: The longer the context, the more incoherent its responses.

Suggestion: yeah, LLMs get dumber as their working memory fills up (just like me). If your session reaches 200k+ tokens, it's usually a sign you could have planned the feature better or split it up. It might be worth restarting with more clarification.
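If your tool doesn't show token usage, a crude estimate is enough to know when to restart. A rough sketch, assuming about 4 characters per token (a common rule of thumb for English text, not a real tokenizer); the function names and the 200k default are just illustrative.

```python
# Assumption: ~4 characters per token. Real tokenizers vary by model and language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN


def should_restart(messages: list[str], budget: int = 200_000) -> bool:
    """True once the estimated session size reaches the budget."""
    return sum(estimate_tokens(m) for m in messages) >= budget
```

Treat the result as a nudge to split the task or start a fresh session, not as an exact count.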


SyneRyder 5 hours ago

>Problem 3: "you'll hit the 200k token limit..." ... Suggestion: use 1 million context window LLMs.

Yes, if the model someone is using only has a 200k token limit, that would immediately suggest to me that it really isn't a sophisticated enough model.

Most of my coding sessions end up being about 350k tokens long by the time I finish, so they wouldn't even fit in a 200k context. And that isn't counting the cache reads by subagents, etc.

It's worth spending some time with the best Opus / GPT model, to at least get a sense of what the frontier is like.

jeffyaw 5 hours ago

MiniMax M3 has a 1M token context window, so not sure how OP is hitting this 200k. Maybe the plan they're on? Or some setting in some layer of whatever their dev tooling is using.

bel8 3 hours ago

Yeah, it's probably some free or entry-level LLM service. Even DeepSeek V4 Flash has a 1 million token context window.