saithound 7 hours ago

I use Pangram quite extensively (burning through my 600-token allowance every month). They have managed to get their false positive rate impressively low: if Pangram says something is 100% AI-written, you can trust that.

But they need to improve their humanizer dataset. Right now, most models can be given system prompts that cause them to emit text Pangram classifies as 100% human. It looks like their automated humanizers do worse than these system prompts. Or (alarming if so) they chose not to include the ones that would make their product look unreliable.

meander_water 6 hours ago

GPTZero is much better at handling humanized outputs. It also has a false positive rate similar to Pangram's.