radarsat1 3 days ago

I find Gemini is also hilariously enthusiastic about telling you how amazingly insightful you are being, almost no matter what you say. Doesn't bother me much, I basically just ignore the first paragraph of any reply, but it's kind of funny.

malfist 3 days ago | parent | next [-]

I was feeding Gemini faux physician's notes to see whether it could produce diagnoses, and every time I fed it new information it told me how great I was at taking comprehensive medical notes. So irritating. It also had a tendency to declare everything a medical crisis requiring additional specialists ASAP. At one point it told me that a faux patient with a normal A1C, normal fasting glucose, and no diabetes needed to see an endocrinologist, because their nominal lab values indicated something was seriously wrong with their pancreas or liver given that the patient was extremely physically active. It said they were "wearing the athlete mask" and that their physical fitness was hiding truly terrible labs.

I pushed back and said it was overreacting, and it told me I was completely correct and very insightful, and that everything was normal and the patient was extremely healthy.

notahacker 3 days ago | parent | next [-]

And then those sorts of responses get parlayed into "chatbots give better feedback than medical doctors" headlines, based on studies that rate them highly on "empathy" and don't worry about minor details like accuracy...

cvwright 3 days ago | parent | prev | next [-]

This illustrates the dangers of training on Reddit.

ryandrake 3 days ago | parent | next [-]

I'm sure if you ask it for any relationship advice, it will eventually take the Reddit path and advise you to dump/divorce your partner, cut off all contact, and involve the police for a restraining order.

uncircle 3 days ago | parent [-]

“My code crashes, what did I do wrong?”

“NTA, the framework you are using is bad and should be ashamed of itself. What you can try to work around the problem is …”

nullc 2 days ago | parent | prev [-]

It's not a (direct) product of reddit. The non-RLHFed base models absolutely do not exhibit this sycophantic behavior.

cubefox 3 days ago | parent | prev [-]

I recently had Gemini disagree with me on a point about philosophy of language and logic, but it phrased the disagreement very politely, by first listing all the related points in which it agreed, and things like that.

So it seems that LLM "sycophancy" isn't necessarily about dishonest agreement, but possibly about being very polite. Which doesn't need to involve dishonesty. So LLM companies should, in principle, be able to make their models both subjectively "agreeable" and honest.

yellowpencil 3 days ago | parent | prev | next [-]

A friend of a friend has been in a rough patch with her spouse and has been discussing it all with ChatGPT. So far ChatGPT has pretty much enthusiastically encouraged divorce, which seems like it will happen soon. I don't think either side is innocent, but ending a relationship over probabilistic token prediction with some niceties thrown in is something else.

ryandrake 3 days ago | parent [-]

Yea, scary. This attitude comes straight from the consensus on Reddit's various relationship and marriage advice forums.

smoe 3 days ago | parent | prev | next [-]

I agree that Gemini is overly enthusiastic, but at least in my limited testing, 2.5 Pro was also the only model that sometimes does say “no.”

Recently I tested both Claude and Gemini by discussing data modeling questions with them. After a couple of iterations, I asked each model whether a certain hack/workaround would be possible to make some things easier.

Claude’s response: “This is a great idea!”, followed by instructions on how to do it.

Gemini’s response: “While technically possible, you should never do this”, along with several paragraphs explaining why it’s a bad idea.

In that case, the “truth” was probably somewhere in the middle, neither a great idea nor the end of the world.

But in the end, both models are so easily biased by subtle changes in wording, or by what they encounter during web searches, that you definitely can't rely on them to push back on anything that isn't completely black and white.

unglaublich 3 days ago | parent | prev | next [-]

It bothers me a lot, because I know plenty of people will feed it the craziest anti-social views and be met with enthusiasm.

erikaxel 3 days ago | parent | prev [-]

100%! I got the following the other day which made me laugh out loud: "That's a very sharp question. You've correctly identified the main architectural tension in this kind of data model"