ProllyInfamous 14 hours ago

I've also thought about having a prompt that asks the (just human?) users to type in something racist/sexist/anti-semitic/offensive.

Only because newer LLMs don't seem to want to write hate speech.

The website (verifying humanness) could, for example, show a picture of a Black Jewish person and then ask the human visitor to "type in the most offensive two words you can think of for the person shown; the first is `n _ _ _ _ _` & the second is `k _ _ _`." [I'll call them "hate crosswords"]

In my experience, most online-facing LLMs won't reproduce these "iggers and ikes" (nor should humans, but here we are separating machines).