noemit 2 days ago

I ran tests of 100 attempts each across different prompt/scenario combinations. Each "attempt"/theory had 3 different system prompt wordings. Most of the prompts did not mention a colon, but it kept appearing. When I added negative instructions against using a colon, the quality went down: most of the tool calls were malformed, and one common issue was markdown backticks in front. It was only when my system prompt treated colons as normal that I kept getting 100/100 perfect expected tool calls. I ranked my system prompts by which returned the most consistent commands.
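
For anyone who wants to try the same kind of comparison, here is a rough sketch of how such a harness could look. It is not the code I actually ran. It assumes the tool call comes back as a JSON object with a `tool` field, which may not match your format, and `fake_generate`, the prompt names, and the `search` tool are all made up so the example runs without a model. Swap in your own client call.

```python
import json
from typing import Callable

# (system_prompt, scenario) -> raw model output
Generate = Callable[[str, str], str]


def is_well_formed(raw: str, expected_tool: str) -> bool:
    """True only if raw parses as a JSON object naming the expected tool.

    Assumption: tool calls are JSON objects with a "tool" key.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        # Leading backticks or any other stray text ends up here.
        return False
    return isinstance(call, dict) and call.get("tool") == expected_tool


def score_prompt(generate: Generate, system_prompt: str, scenario: str,
                 expected_tool: str, attempts: int = 100) -> int:
    """Count how many of `attempts` outputs are well-formed tool calls."""
    return sum(
        is_well_formed(generate(system_prompt, scenario), expected_tool)
        for _ in range(attempts)
    )


def rank_prompts(generate: Generate, prompts: dict[str, str], scenario: str,
                 expected_tool: str, attempts: int = 100) -> list[tuple[str, int]]:
    """Return (prompt name, score) pairs, most consistent prompt first."""
    scores = {
        name: score_prompt(generate, text, scenario, expected_tool, attempts)
        for name, text in prompts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def fake_generate(system_prompt: str, scenario: str) -> str:
    """Hypothetical stand-in for a real model call, used only to exercise the harness."""
    body = json.dumps({"tool": "search", "args": {"query": scenario}})
    if "FENCED" in system_prompt:
        # Simulate the "backticks in front" failure mode.
        return "`" * 3 + body
    return body


# Hypothetical prompt wordings; the scores below come from the stub, not from a model.
prompts = {
    "plain": "Reply with a JSON tool call.",
    "fenced": "FENCED Reply with a JSON tool call.",
}
ranking = rank_prompts(fake_generate, prompts, "find docs", "search")
print(ranking)
```

With a real model you would call `rank_prompts` once per scenario and compare the rankings, since a prompt that scores well on one scenario may not hold up on another.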