simianwords 7 hours ago

There's something off with this because Haiku should not be that good.

rattray 2 hours ago

I've been very curious about that too. I wonder if it's actually much better at admitting when it doesn't know something, because it thinks it's a "dumber model". But I haven't played with this at all myself.

jwpapi 7 hours ago

The hallucination benchmark is hallucinating.