hinkley 4 days ago
Especially given the other conversation that happened this morning. The more you tell an AI not to obsess about a thing, the more it obsesses about it. So trying to make a model that will never tell people to self-harm is futile. Though maybe we're just doing it wrong, and the self-filtering should be external filtering: one model to censor results that don't fit, and one to generate results with lighter self-censorship.
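
A minimal sketch of that split, assuming the Hugging Face transformers library, with gpt2 as a stand-in generator and unitary/toxic-bert as a stand-in external censor (the model names and threshold are illustrative, not a recommendation):

    # Two-model pipeline: a lightly self-censored generator plus an
    # external filter model that vetoes finished output after the fact.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    moderator = pipeline("text-classification", model="unitary/toxic-bert")

    def generate_filtered(prompt, max_new_tokens=50, threshold=0.5):
        # Generate freely, with no topic avoidance baked into the model.
        draft = generator(prompt, max_new_tokens=max_new_tokens)[0]["generated_text"]
        # Classify the finished text; e.g. {"label": "toxic", "score": 0.97}
        verdict = moderator(draft)[0]
        if verdict["label"] == "toxic" and verdict["score"] > threshold:
            return "[withheld by external filter]"
        return draft

The appeal of this shape is that the generator never has to be trained to avoid the topic at all; the censor only ever sees finished text, so the "don't think about the elephant" problem stays out of the generation step entirely.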