nowittyusername 3 days ago

I believe this is the root of the problem for all agentic coding solutions. They are gimping the full context through fancy function calling and tool use to reduce how much context is actually sent through the API. The problem with this is that you can never know in advance what context is needed for the problem to be solved in the best way. The funny thing is, this behavior actually leads many people to believe these models are LESS capable than they actually are, because people don't realize how restricted these models are behind the scenes by the developers. The good news is, we are entering the era of large context windows, and we will all see a huge performance increase in coding as a result of these advancements.
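
As a toy illustration of the mechanism (not any particular product's pipeline; REPO, search_repo and build_prompt are all invented for this sketch): the harness runs a retrieval step, keeps a handful of snippets, and silently drops the rest of the repo from the prompt.

    # Toy sketch of how an agent harness shrinks what the model sees.
    REPO = {
        "auth.py":  "def login(user): ...",
        "db.py":    "def connect(): ...",
        "utils.py": "def slugify(s): ...",
    }

    def search_repo(query: str, k: int = 2) -> list[str]:
        # Stand-in for embedding search: count query words in each file.
        words = query.lower().split()
        scored = sorted(REPO.items(),
                        key=lambda kv: sum(w in kv[1].lower() for w in words),
                        reverse=True)
        return [f"# {name}\n{src}" for name, src in scored[:k]]

    def build_prompt(task: str) -> str:
        context = "\n\n".join(search_repo(task))  # only k snippets survive
        return f"{context}\n\nTask: {task}"       # model never sees the rest

    print(build_prompt("fix the login bug"))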

pzo 3 days ago | parent | next [-]

OpenAI shared a chart showing a performance drop with very large contexts (e.g. 500k tokens), so you still want to limit the context not only for cost but for quality as well. You also probably want to limit context to speed up inference and get responses faster.
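
A minimal sketch of what that limiting can look like, using OpenAI's tiktoken tokenizer; the 32k budget and the drop-oldest-messages-first policy here are my assumptions, not anyone's published recipe:

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    def trim_to_budget(messages: list[str], budget: int = 32_000) -> list[str]:
        # Walk backwards so the most recent messages are kept first.
        kept, used = [], 0
        for msg in reversed(messages):
            n = len(enc.encode(msg))
            if used + n > budget:
                break
            kept.append(msg)
            used += n
        return list(reversed(kept))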

I agree though that a lot of these agents are black boxes, and it's hard to even learn how to best combine .rules, llms.txt, PRDs, MCP, web search, function calling, and memory. Most IDEs don't provide output where you can inspect the final prompts to see how those are assembled; maybe you have to use something like mitmproxy to inspect the requests (sketch below), but some tool for this would be useful for learning best practices.
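
For what it's worth, mitmproxy can do this with a tiny addon; here's a sketch, assuming the agent respects HTTPS_PROXY and mitmproxy's CA cert is installed (the host list is a guess, extend it for your provider):

    # log_llm_calls.py -- run with: mitmdump -s log_llm_calls.py
    # then point the tool at the proxy, e.g. HTTPS_PROXY=http://localhost:8080
    from mitmproxy import http

    LLM_HOSTS = {"api.openai.com", "api.anthropic.com"}  # assumption

    def request(flow: http.HTTPFlow) -> None:
        if flow.request.host in LLM_HOSTS:
            print(f"--- {flow.request.method} {flow.request.pretty_url}")
            print(flow.request.get_text()[:2000])  # prompt payload, truncated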

I will be trying Roo Code and Cline more, since they are open source and you can at least see the system prompts.

cynicalpeace 3 days ago | parent | prev | next [-]

This stuff is so easy to do with Cursor. Just pass in the approximate surface area of the context, and it doesn't RAG anything if your context isn't too large.

asadm 3 days ago | parent [-]

I haven't tried it recently, but does it tell you whether it RAG'ed or not, i.e. can I peek at the context it sent to the model?

asadm 3 days ago | parent | prev [-]

Exactly. I understand the reason behind this, but it's too magical for me. I just want dumb tooling between me and my LLM.