Remix.run Logo
the_mitsuhiko 4 days ago

> I'm assuming there's been a fine-tuning step with lots of examples which put desired tool invocations into some sort of delineated block or something, which the Claude/ChatGPT server understand?

As far as I know that's what's happening. They are training it to return tool responses when it's unsure about the answer or instructed to do so. There are generic tool trainings for just following the response format, and then probably there are some tool specific trainings. For instance gpt-oss loves to use the search tool, even if it's not mentioned anywhere. Anthropic lists well known tools in their document (eg: text_editor, bash). They are likely to have been trained specifically to follow some deeper semantics compared to just generic tool usage.

The whole thing is pretty brittle and tool invocations are just taking place via in-band signalling, delineated by special tokens or token sequences.