SwellJoe 2 hours ago

OK, this is a silly thing to do, but I wanted to be sure I wasn't imagining that I can tell when prose is written by AI. So I made a game. Turns out Claude Opus can fool me better than any other model, but even it can't fool me a majority of the time. I average about 85% accuracy on this (Claude prepared the corpus; I'm going in with very little foreknowledge). GLM is also very close to being able to convincingly write like a human. I'm not as good at detecting AI prose as I expected to be, but I can still spot it pretty consistently.

https://prose-or-con.com

GPT-4o likes writing poetry with bees in it, for some reason. Qwen models are decidedly purple in their prose.