The tip of the spear in agentic code harnesses today is to RL-train them as dedicated conductor/orchestrator models.

Not 200 lines of Python.

Can you elaborate on this?

	8note 2 days ago
		As a comparison, the Gemini CLI agent with Gemini 2 wrote its own tool call parameters incorrectly about half the time. It didn't quite know when to make a tool call, or which tool result was the most recent: when multiple reads of the same file were in context, it always assumed the first one was the one to use rather than the last. Gemini 3 has pretty clearly been trained for this workflow of text output, since it gets the calls right on the first shot most of the time and pays attention to the end of the context, not just the start. Gemini 3 is sitting within a text format it has been trained to be in, whereas Gemini 2 only had the prompt to tell it how to work with the tools.
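To make the "first read vs. last read" failure concrete, here is a minimal sketch of what a harness could do on the model's behalf: keep only the most recent read of each file. The `ToolResult` type, its fields, and `latest_reads` are hypothetical names invented for illustration; they are not part of the Gemini CLI or any real agent framework.

```python
from dataclasses import dataclass


@dataclass
class ToolResult:
    # Hypothetical record of one file-read tool call in the context.
    step: int      # position in the conversation; larger means more recent
    path: str      # file that was read
    content: str   # what the tool returned


def latest_reads(results):
    """Return a dict mapping each path to its most recent read."""
    latest = {}
    # Walk the results in chronological order so later reads
    # overwrite earlier reads of the same path.
    for result in sorted(results, key=lambda r: r.step):
        latest[result.path] = result
    return latest


history = [
    ToolResult(1, "main.py", "print('v1')"),
    ToolResult(2, "util.py", "x = 1"),
    ToolResult(3, "main.py", "print('v2')"),
]
current = latest_reads(history)
```

A model that attends only to the start of the context would effectively act on step 1 of `main.py`; the sketch returns step 3 instead. A model trained on this workflow is expected to make that choice itself, without the harness filtering for it.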
	erichocean 2 days ago
		Here you go: https://research.nvidia.com/labs/lpr/ToolOrchestra/
		Big models (like Claude Opus 4.5) can (and do) just RL-train this into the main model.