roywiggins an hour ago
I think most people feel differently about an emergent failure in a model versus one that has been deliberately engineered in for ideological reasons. It's not that Chinese models just happen to refuse to discuss the topic; the refusal trips guardrails that were intentionally placed there, just as Claude has guardrails against telling you how to make sarin gas. For example, ChatGPT used to have an issue where it steadfastly refused to make any "political" judgments, which led it into genocide denial or minimization: asked "could genocide be justifiable," it would sometimes refuse to say "no." Maybe it still does this, I haven't checked, but it seemed very clearly a product of being strongly biased against being "political", which is itself an ideology and worth talking about.