What do the calls being sequential have to do with tokens? Do you just mean that the LLM has to think everytime they get a response (as opposed to being able to compose them)?

▲

zozbot234 4 hours ago | parent [-]

LLMs can use CLI interfaces to compose multiple tool calls, filter the outputs etc. instead of polluting their own context with a full response they know they won't care about. Command line access ends up being cleaner than the usual MCP-and-tool-calls workflow. It's not just Anthropic, the Moltbot folks found this to be the case too.

	▲	foota 4 hours ago \| parent [-]
		That makes sense! The only flaw here imo is that sometimes that thinking is useful. Sub-agents for tool calls imo make a nice sort of middle ground where they can both be flexible and save context. Maybe we need some tool call composing feature, a la io_uring :)