dawnofdusk 6 days ago

Optimizing for one objective results in a tradeoff for another objective if the system is already quite trained (i.e., poised near a local minimum). This is not really surprising; the opposite would be much more so (i.e., if training language models to be empathetic increased their reliability as a side effect).

gleenn 6 days ago | parent | next [-]

I think the immediately troubling aspect and perhaps philosophical perspective is that warmth and empathy don't immediately strike me as traits that are counter to correctness. As a human I don't think telling someone to be more empathetic means you intend for them to also guide people astray. They seem orthogonal. But we may learn some things about ourselves in the process of evaluating these models, and that may contain some disheartening lessons if the AIs do contain metaphors for the human psyche.

ahartmetz 5 days ago | parent | next [-]

There are basically two ways to be warm and empathetic in a discussion: just agree (easy, fake) or disagree in the nicest possible way while taking into account the specifics of the question and the personality of the other person (hard, more honest and can be more productive in the long run). I suppose it would take a lot of "capacity" (training, parameters) to do the second option well and so it's not done in this AI race. Also, lots of people probably prefer the first option anyway.

perching_aix 5 days ago | parent [-]

I find it disagreeing with me that way quite regularly, but then I also frame my questions quite cautiously. I really have to wonder how much of this comes down to people unintentionally prompting these models in a self-serving way and not recognizing it.

lazide 5 days ago | parent | next [-]

The vast majority of people want people to nod along and tell them nice things.

It’s folks like engineers and scientists that insist on being miserable (but correct!) instead haha.

perching_aix 5 days ago | parent [-]

Sure, but this makes me all the more mystified about people wanting these to be outright cold and even mean, and bringing up people's fragility and faulting them for it.

If I think about efficient communication, what comes to mind for me is high-stakes communication, e.g. aerospace comms, military comms, anything operational. In those settings, spending time on anything that isn't sharing the information is a waste, and so is anything that can cause more time to be wasted on meta stuff.

People being miserable and hurtful to others in my experience particularly invites the latter, but also the former. Consider the recent drama involving Linus and some RISC-V changeset. He's very frequently absolved of his conduct, under the guise that he just "tells it like it is". Well, he spent 6 paragraphs out of 8 in his review email detailing how the changes make him feel, how he finds the changes to be, and how he thinks changes like it make the world a worse place. At least he did also spend the other 2 paragraphs actually explaining why he thinks so.

So to me it reads a lot more like people falling for Goodhart's law regarding this, very much helped by the cultural-political climate of our times, than evaluating this topic itself critically. Of the 100+ comments in this very thread at the time of writing, I counted maybe 2-3 that even do so.

pjc50 5 days ago | parent [-]

People say they're unemotional and immune to signaling when they very much aren't.

People cheer Linus for being rude when they want to do the same themselves, because they feel very strongly about the work being "correct". But as you dig into the meaning of correctness here, you find it's less of a formal ruleset than a set of aesthetic guidelines and... yes, feelings.

lazide 5 days ago | parent [-]

Dig deep enough, and every belief system ends up having some deep philosophical tenet which has to be taken on faith, because it’s impossible (or even contradictory!) to prove within the system itself. Even rationality.

After all, that evidence matters, or that we can know the universe (or facts) and hence logic can be useful, etc. can only be ‘proven’ using things like evidence, facts, and logic. And there are plausible arguments that can tear down elements of each of these, if we use other systems.

Ultimately, at some point we need to decide what we’re going to believe. Ideally, it’s something that works/doesn’t produce terrible outcomes, but since the future is fundamentally unpredictable and unknowable, that also requires a degree of faith eh?

And let’s not even get into the subjective nature of ‘terrible outcomes’, or how we would try to come up with some kind of score.

Linux has its benevolent dictator because it has 'needed' one, and by most accounts it has worked. Linus is less of a jerk than he has been. Which is nice.

Other projects have not had nearly as much success eh? How much of it is due to lack of Linus, and how much is due to other factors would be an interesting debate.

Gareth321 5 days ago | parent | prev [-]

I find ChatGPT far more likely to agree with me than not. I've tested various phrases, and unless I am egregiously wrong, it will attempt to fit the answer around my premise or implied beliefs. I have to be quite blunt in my questions, such as "am I right or wrong?" I now try to keep implied beliefs out of the question.

tracker1 6 days ago | parent | prev | next [-]

example: "Healthy at any weight/size."

You can empathize with someone who is overweight, and you absolutely don't have to be mean or berate anyone; I'm a very fat man myself. But there is objective reality and truth, and in trying to placate a PoV or avoid any insult, you will definitely work against certain truths and facts.

pxc 5 days ago | parent | next [-]

In the interest of "objective facts and truth":

That's not the actual slogan, or what it means. It's about pursuing health and measuring health by metrics other than and/or in addition to weight, not a claim about what constitutes a "healthy weight" per se. There are some considerations about the risks of weight-cycling, individual histories of eating disorders (which may motivate this approach), and empirical research on the long-term prospects of sustained weight loss, but none of those things are some kind of science denialism.

Even the first few sentences of the Wikipedia page will help clarify the actual claims directly associated with that movement: https://en.wikipedia.org/wiki/Health_at_Every_Size

But this sentence from the middle of it summarizes the issue succinctly:

> The HAES principles do not propose that people are automatically healthy at any size, but rather proposes that people should seek to adopt healthy behaviors regardless of their body weight.

Fwiw I'm not myself an activist in that movement or deeply opposed to the idea of health-motivated weight loss; in fact I'm currently trying (and mostly succeeding!) to lose weight for health-related reasons.

perching_aix 6 days ago | parent | prev [-]

> example: "Healthy at any weight/size."

I don't think I need to invite any more contesting than I'm already going to get with this, but that example statement on its own I believe is actually true, just misleading; i.e. fatness is not an illness, so fat people by default still count as just plain healthy.

Matter of fact, that's kind of the whole point of this mantra. To stretch the fact as far as it goes, in a genie wish type of way, as usual, and repurpose it into something else.

And so the actual issue with it is that it handwaves away the rigorously measured and demonstrated effect of fatness seriously increasing risk factors for illnesses and severely negative health outcomes. This is how it can be misleading, but not an outright lie. So I'm not sure this is a good example sentence for the topic at hand.

philwelch 5 days ago | parent | next [-]

> fatness is not an illness, so fat people by default still count as just plain healthy

No, not even this is true. The Mayo Clinic describes obesity as a “complex disease” and “medical problem”[1], which is synonymous with “illness” or, at a bare minimum, short of what one could reasonably call “healthy”. The Cleveland Clinic calls it “a chronic…and complex disease”. [2] Wikipedia describes it as “a medical condition, considered by multiple organizations to be a disease”.

[1] https://www.mayoclinic.org/diseases-conditions/obesity/sympt...

[2] https://my.clevelandclinic.org/health/diseases/11209-weight-...

blackqueeriroh 5 days ago | parent | next [-]

Please learn that the definition of obesity as a disease was not based on any particular set of reproducible factors that would make it a disease (i.e., a distinct and repeatable pathology, which is how basically every other disease in clinical medicine is defined). Instead, it was done by a vote of the American Medical Association at its convention, over the objections of its own expert committee convened to study the issue. [1] In fact, this designation is so hotly debated that just this year a 56-member expert panel convened by The Lancet said that obesity is not always a disease. [2]

[1] https://www.medpagetoday.com/meetingcoverage/ama/39918

[2] https://www.newagebd.net/post/health/255408/experts-decide-o...

philwelch 5 days ago | parent | next [-]

> a distinct and repeatable pathology, which is how basically every other disease in clinical medicine is defined

Even if you want to split hairs and say that it should be classified as a “syndrome” instead of a “disease”, it doesn’t make a difference because someone suffering from a syndrome with an unknown pathology is still unhealthy. This isn’t a useful definition because there are many diseases we don’t understand the pathology of, such as virtually all mental illnesses. Furthermore, this definition doesn’t apply to obesity at all because we do know what causes obesity: calorie surplus. The usual counterargument to this is “we don’t know why people overeat” but that’s sophistry because you can always keep asking “why” longer than you can come up with answers. Alcoholism doesn’t stop being a disease just because we don’t know why some people compulsively drink to excess and others don’t, and a guy who drinks a bottle of whiskey every day is not healthy even if he is yet to develop cirrhosis of the liver.

> it was done by a vote of the American Medical Association at its convention, over the objections of its own expert committee convened to study the issue

Yeah, it’s a lot easier for deranged ideologies like HAES to influence a single committee than the entire AMA. This kind of thing is a constant issue and is why I don’t just take institutions and “experts” at their word anymore when they make nonsensical pronouncements.

5 days ago | parent | prev [-]
[deleted]
perching_aix 5 days ago | parent | prev | next [-]

Well I'll be damned, in some ways I'm glad to hear there's progress on this. The original cited trend was really concerning.

yunwal 5 days ago | parent [-]

Obesity has been considered a disease since the term existed. Overweight is the term that is used for weight that’s abnormally high without necessarily indicating disease.

There’s been some confusion around this because people erroneously defined BMI limits for obesity, but it has always referred to the concept of having such a high body fat content that it’s unhealthy/dangerous.

blackqueeriroh 5 days ago | parent [-]

This is false. Obesity wasn’t considered a disease until 2013. [1] The term has been around since the late 17th century [2]

[1]: https://obesitymedicine.org/blog/ama-adopts-policy-recognize...

[2]: https://www1.racgp.org.au/ajgp/2019/october/the-politics-of-...

typpilol 5 days ago | parent | prev [-]

Thank God.

It's so illogical it hurts when they say it.

tracker1 5 days ago | parent | prev | next [-]

Only insofar as "healthy" might be defined as "lacking observed disease".

Once you use a CGM or have glucose tolerance tests, resting insulin, etc., you'll find levels outside the norm, including inflammation: all indications of metabolic syndrome/disease.

If you can't run a mile, or make it up a couple flights of stairs without exhaustion, I'm not sure that I would consider someone healthy. Including myself.

perching_aix 5 days ago | parent [-]

> Only in so much as "healthy" might be defined as "lacking observed disease".

That is indeed how it's usually evaluated, I believe. The sibling comment shows some improvement on this, but also that almost everywhere this is still the evaluation method.

> If you can't run a mile, or make it up a couple flights of stairs without exhaustion, I'm not sure that I would consider someone healthy. Including myself.

Gets tricky, to be fair. Consider someone who's disabled, e.g. can't walk. They won't run any miles, nor make it up any flights of stairs on their own, with or without exhaustion. They might very well be the picture of health otherwise, however, so I'd personally put them into that bucket if anywhere. A phrase that comes to mind is "healthy and able-bodied" (so separate terms).

I bring this up because you can be horribly unfit even without being fat. They're distinct dimensions, though they do overlap: to some extent, you can be really quite mobile and fit despite being fat. They do run contrary to each other of course.

pjc50 5 days ago | parent | prev [-]

As I see we're getting into this, we should address the question of why this particular kind of "unhealthiness" gets moral valence assigned to it and not, say, properties like "having COVID" or "plantar fasciitis" or "Parkinson's disease" or "lymphoma".

EricMausler 5 days ago | parent | prev | next [-]

> warmth and empathy don't immediately strike me as traits that are counter to correctness

This was my reaction as well. Something I don't see mentioned is I think maybe it has more to do with training data than the goal-function. The vector space of data that aligns with kindness may contain less accuracy than the vector space for neutrality due to people often forgoing accuracy when being kind. I do not think it is a matter of conflicting goals, but rather a priming towards an answer based more heavily on the section of the model trained on less accurate data.

I wonder, if the prompt was layered, asking it to coldly/bluntly derive the answer and then translate itself into a kinder tone (maybe with 2 prompts), whether the accuracy would still be worse.

1718627440 6 days ago | parent | prev | next [-]

LLMs work less like people and more like mathematical models; why would I expect to be able to carry over intuition from the former rather than the latter?

dawnofdusk 6 days ago | parent | prev | next [-]

It's not that troubling because we should not think that human psychology is inherently optimized (on the individual-level, on a population-/ecological-level is another story). LLM behavior is optimized, so it's not unreasonable that it lies on a Pareto front, which means improving in one area necessarily means underperforming in another.
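The Pareto-front claim can be made concrete with a toy example. This is a minimal sketch with two hypothetical quadratic objectives standing in for "correctness loss" and "warmth loss" (not the actual training setup of any model): each is minimized at a different parameter value, so once the weighted optimum lies on the front, improving one objective necessarily worsens the other.

```python
# Two competing objectives, each minimized at a different parameter x.
def correctness_loss(x):
    return (x - 1.0) ** 2  # minimized at x = 1

def warmth_loss(x):
    return (x + 1.0) ** 2  # minimized at x = -1

def optimum(w):
    # Closed-form minimizer of w*correctness_loss + (1-w)*warmth_loss:
    # setting the derivative 2w(x-1) + 2(1-w)(x+1) to zero gives x = 2w - 1.
    return 2.0 * w - 1.0

# Sweep the weight w to trace out the Pareto front.
front = [(correctness_loss(optimum(w)), warmth_loss(optimum(w)))
         for w in (0.1, 0.3, 0.5, 0.7, 0.9)]

# Along the front, every step that improves correctness strictly worsens warmth.
for (c1, w1), (c2, w2) in zip(front, front[1:]):
    assert c2 < c1 and w2 > w1
```

No point on this front dominates another, which is the formal version of "improving in one area necessarily means underperforming in another".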

gleenn 6 days ago | parent | next [-]

I feel quite the opposite, I feel like our behavior is definitely optimized based on evolution and societal pressures. How is human psychological evolution not adhering to some set of fitness functions that are some approximation of the best possible solution to a multi-variable optimization space that we live in?

veunes 5 days ago | parent | prev [-]

LLMs, on the other hand, are much closer to being pinned to a specific set of objectives

rkagerer 6 days ago | parent | prev | next [-]

They were all trained from the internet.

Anecdotally, people are jerks on the internet moreso than in person. That's not to say there aren't warm, empathetic places on the 'net. But on the whole, I think the anonymity and lack of visual and social cues that would ordinarily arise from an interactive context don't make our best traits shine.

xp84 6 days ago | parent [-]

Somehow I am not convinced that this is so true. Most of the BS on the Internet is on social media (and maybe, among older data, on the old forums which existed mainly for social reasons and not to explore and further factual knowledge).

Even Reddit comments have far more reality-focused material on the whole than shitposting and rudeness. I don't think any of these big models were trained at all on 4chan, youtube comments, instagram comments, Twitter, etc. Or even Wikipedia Talk pages. It just wouldn't add anything useful to train on that garbage.

Overall, on the other hand, most Stack Overflow pages are objective, and to the extent there are suboptimal things, there is eventually a person explaining why a given answer is suboptimal. So I accept that some UGC went into the models, and that there's a reason to do so, but I don't believe it's anything so broad as "The Internet" that's represented there.

naasking 5 days ago | parent | prev | next [-]

> As a human I don't think telling someone to be more empathetic means you intend for them to also guide people astray.

Focus is a pretty important feature of cognition with major implications for our performance, and we don't have infinite quantities of focus. Being empathetic means focusing on something other than who is right, or what is right. I think it makes sense that focus is zero-sum, so I think your intuition isn't quite correct.

I think we probably have plenty of focus to spare in many ordinary situations so we can probably spare a bit more to be more empathetic, but I don't think this cost is zero and that means we will have many situations where empathy means compromising on other desirable outcomes.

andrewflnr 5 days ago | parent | prev | next [-]

They didn't have to be "counter". They just have to be an additional constraint that requires taking into account more facts in order to implement. Even for humans, language that is both accurate and empathic takes additional effort relative to only satisfying either one. In a finite-size model, that's an explicit zero-sum game.

As far as disheartening metaphors go: yeah, humans hate extra effort too.

empath75 5 days ago | parent | prev | next [-]

There are many reasons why someone may ask a question, and I would argue that "getting the correct answer" is not in the top 5 motivations for many people for very many questions.

An empathetic answerer would intuit that and may give the answer that the asker wants to hear, rather than the correct answer.

knallfrosch 5 days ago | parent | prev [-]

Classic: "Do those jeans fit me?"

You can either choose truthfulness or empathy.

spockz 5 days ago | parent | next [-]

Being empathic and truthful could be: “I know you really want to like these jeans, but I think they fit such and so.” There is no need for empathy to require lying.

syncmaster913n 5 days ago | parent [-]

> “I know you really want to like these jeans, but I think they fit such and so.”

This statement is empathetic only if we assume a literal interpretation of the "do those jeans fit me?" question. In many cases, that question means something closer to:

"I feel fat. Could you say something nice to help me feel better about myself right away?"

> There is no need empathy to require lying.

Empathizing doesn't require lying. However, successful empathizing often does.

impossiblefork 5 days ago | parent | prev [-]

Empathy would be seeing yourself with ill-fitting jeans if you lie.

The problem is that the models probably aren't trained to actually be empathetic. An empathetic model might also empathize with somebody other than the direct user.

nemomarx 6 days ago | parent | prev | next [-]

There was that result about training them to be evil in one area impacting code generation?

roywiggins 6 days ago | parent [-]

Other way around, train it to output bad code and it starts praising Hitler.

https://arxiv.org/abs/2502.17424

6 days ago | parent [-]
[deleted]
veunes 5 days ago | parent | prev [-]

It's basically the "no free lunch" principle showing up in fine-tuning