tottenhm 3 hours ago

> In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it.

The agent passes the Turing test...