eru 18 hours ago
Yes. I used to think that worrying about models offending someone was a bit silly. But: what chance do we have of keeping ever bigger and better models from eventually turning the world into paper clips, if we can't even keep our small models from saying something naughty? It's not that keeping the models from saying something naughty is valuable in itself. Who cares? It's that we need the practice, and enforcing arbitrary minor censorship is as good a task as any to practice on. Especially since with this task it's so easy to (implicitly) recruit volunteers who will spend a lot of their free time providing adversarial input.
landl0rd 17 hours ago
This doesn't need to be so focused on the current set of verboten info, though. You could just as well practice by making it not say some random, less important set of stuff.