Remix.run Logo
chermi 4 hours ago

You can really notice the tool use problems. They gotta get on that. The agent trend seems real, and powerful. They can't afford to fall behind on it.

HardCodedBias 26 minutes ago | parent | next [-]

"They can't afford to fall behind on it."

They are very, very seriously far behind as of 3.0.

We'll see if 3.1 addresses the issue at all.

verdverm 4 hours ago | parent | prev [-]

I don't really have tool usage issues that I don't put under that doesn't follow system prompt instructions consistently

there are these times where it puts a prefix on all function calls, which is weird and I think hallucination, so maybe that one

3.1 hopefully fixes that