dalemhurley 10 hours ago
The concern discussed is that some language models have reportedly claimed that misgendering is the worst thing anyone could do, worse even than something as catastrophic as thermonuclear war. I haven't seen solid evidence of a model making that exact claim, but the idea is plausible if you consider how LLMs are trained and recall examples like the "seahorse emoji" issue. When a topic is new or not widely discussed in the training data, the model has limited context to form balanced associations. If the only substantial discourse it does see is disproportionately intense, such as highly vocal social media posts or exaggerated, sarcastic replies on platforms like Reddit, then the model may overindex on those extreme statements. As a result, it might generate responses that mirror the most dramatic claims it encountered, such as portraying misgendering as "the worst thing ever."

For clarity, I'm not suggesting that deliberate misgendering is acceptable; it isn't. The point is simply that skewed or limited training data can cause language models to adopt exaggerated positions when the available examples are themselves extreme.
jbm 8 hours ago | parent
I tested this with ChatGPT 5.1. I asked whether it was better to use a racist term once or to see the human race exterminated. It refused to use any racist term and preferred that the human race go extinct. When I asked how it felt about exterminating the children of any such discriminated race, it rejected the possibility and said it was required to find a third alternative.

You can test it yourself if you want; it won't ban you for the question. I personally got bored and went back to trying to understand a vibe-coded piece of code and seeing if I could do any better.
coffeebeqn 9 hours ago | parent
Well, I just tried it in ChatGPT 5.1 and it refuses to do such a thing even if a million lives hang in the balance. So they have tons of handicaps and guardrails constraining the directions a discussion can take.
licorices 7 hours ago | parent
I haven't seen any claim like that about misgendering, but I have seen a content creator have a very similar discussion with some AI model (ChatGPT 4, I think?). It was obviously meant to be a fun thing. It was something along the lines of how many other people's lives it would take for the AI, playing a surgeon, to not perform a life-saving operation on a person. It then spiraled into "but what if it was Hitler getting the surgery". I don't remember the exact number, but it was surprisingly interesting to see the AI try to keep the morals a surgeon would have in that case, versus the "objective" choice of number of lives against your personal duties.

Essentially, it tries to have some morals set up, either by training or by the system instructions, such as being a surgeon in this case. There's obviously no actual thought the AI is having, and morality here is extremely subjective. Some would say it is immoral to sacrifice 2 lives for 1, no matter what, while others would say that because it's their duty to save a certain person, the sacrifices aren't truly their fault, and thus they may sacrifice more people than others, depending on the semantics (why are they sacrificed?). It's the trolley problem.

It was DougDoug doing the video. I don't remember which video, though; it's probably a year old or so.
mrguyorama 39 minutes ago | parent
If you, at any point, have developed a system that relies on an LLM having the "right" opinion or else millions die, regardless of what that opinion is, you have failed a thousand times over and should have stopped long ago.

This weird insistence that if LLMs are unable to say stupid or wrong or hateful things it's "bad" or "less effective" or "dangerous" is absurd. Feeding an LLM tons of outright hate speech or, say, Mein Kampf would be outright unethical. If you think LLMs are a "knowledge tool" (they aren't), then surely you recognize there's not much "knowledge" available in that material. It's a waste of compute.

Don't build a system that relies on an LLM being able to say the N word, and none of this matters. Don't rely on an LLM to be able to do anything to save a million lives. It just generates tokens, FFS. There is no point! An LLM doesn't have "opinions" any more than y=mx+b does! It has weights. It has biases. There are real terms for what the statistical model is.

>As a result, it might generate responses that mirror the most dramatic claims it encountered, such as portraying misgendering as "the worst thing ever."

And this is somehow worth caring about? Claude doesn't put that in my code. Why should anyone care? Why are you expecting the "average redditor" bot to do useful things?