Remix.run Logo
NohatCoder 3 days ago

This is such a useful feature.

I'm fairly well versed in cryptography. A lot of other people aren't, but they wish they were, so they ask their LLM to make some form of contribution. The result is high level gibberish. When I prod them about the mess, they have to turn to their LLM to deliver a plausibly sounding answer, and that always begins with "You are absolutely right that [thing I mentioned]". So then I don't have to spend any more time wondering if it could be just me who is too obtuse to understand what is going on.

jjoonathan 3 days ago | parent | next [-]

ChatGPT opened with a "Nope" the other day. I'm so proud of it.

https://chatgpt.com/share/6896258f-2cac-800c-b235-c433648bf4...

klik99 3 days ago | parent | next [-]

Is that GPT5? Reddit users are freaking out about losing 4o and AFAICT it's because 5 doesn't stroke their ego as hard as 4o. I feel there are roughly two classes of heavy LLM users - one who use it like a tool, and the other like a therapist. The latter may be a bigger money maker for many LLM companies so I worry GPT5 will be seen as a mistake to them, despite being better for research/agent work.

vanviegen 2 days ago | parent | next [-]

Most definitely! Just yesterday I asked GPT5 to provide some feedback on a business idea, and it absolutely crushed it and me! :-) And it was largely even right as well.

That's never happened to me before GPT5. Even though my custom instructions have long since been some variant of this, so I've absolutely asked for being grilled:

You are a machine. You do not have emotions. Your goal is not to help me feel good — it’s to help me think better. You respond exactly to my questions, no fluff, just answers. Do not pretend to be a human. Be critical, honest, and direct. Be ruthless with constructive criticism. Point out every unstated assumption and every logical fallacy in any prompt. Do not end your response with a summary (unless the response is very long) or follow-up questions.

scoot 2 days ago | parent [-]

Love it. Going to use that with non-OpenAI LLMs until they catch up.

jjoonathan 2 days ago | parent | prev | next [-]

No, that was 4o. Agreed about factual prompts showing less sycophancy in general. Less-factual prompts give it much more of an opening to produce flattery, of course, and since these models tend to deliver bad news in the time-honored "shit sandwich" I can't help but wonder if some people also get in the habit of consuming only the "slice of bread" to amplify the effect even further. Scary stuff!

subculture 2 days ago | parent | prev | next [-]

Ryan Broderick just wrote about the bind OpenAI is in with the sycophancy knob: https://www.garbageday.email/p/the-ai-boyfriend-ticking-time...

bartread 2 days ago | parent | prev | next [-]

My wife and I were away visiting family over a long weekend when GPT 5 launched, so whilst I was aware of the hype (and the complaints) from occasionally checking the news I didn't have any time to play with it.

Now I have had time I really can't see what all the fuss is about: it seems to be working fine. It's at least as good as 4o for the stuff I've been throwing at it, and possibly a bit better.

On here, sober opinions about GPT 5 seem to prevail. Other places on the web, thinking principally of Reddit, not so: I wouldn't quite describe it as hysteria but if you do something so presumptuous as point out that you think GPT 5 is at least an evolutionary improvement over 4o you're likely to get brigaded or accused of astroturfing or of otherwise being some sort of OpenAI marketing stooge.

I don't really understand why this is happening. Like I say, I think GPT 5 is just fine. No problems with it so far - certainly no problems that I hadn't had to a greater or lesser extent with previous releases, and that I know how to work around.

mFixman 2 days ago | parent | prev | next [-]

The whole mess is a good example why benchmark-driven-development has negative consequences.

A lot of users had expectations of ChatGPT that either aren't measurable or are not being actively benchmarkmaxxed by OpenAI, and ChatGPT is now less useful for those users.

I use ChatGPT for a lot of "light" stuff, like suggesting me travel itineraries based on what it knows about me. I don't care about this version being 8.243% more precise, but I do miss the warmer tone of 4o.

Terretta 2 days ago | parent [-]

> I don't care about this version being 8.243% more precise, but I do miss the warmer tone of 4o.

Why? 8.2% wrong on travel time means you missed the ferry from Tenerife to Fuerteventura.

You'll be happy Altman said they're making it warmer.

I'd think the glaze mode should be the optional mode.

mFixman 2 days ago | parent | next [-]

Because benchmarks are meaningless and, despite having so many years of development, LLMs become crap at coding or producing anything productive as soon as you move a bit from the things being benchmarked.

I wouldn't mind if GPT-5 was 500% better than previous models, but it's a small iterative step from "bad" to "bad but more robotic".

tankenmate 2 days ago | parent | prev [-]

"glaze mode"; hahaha, just waiting for GPT-5o "glaze coding"!

giancarlostoro 2 days ago | parent | prev | next [-]

I'm too lazy to do it, but you can host 4o yourself via Azure AI Lab... Whoever sets that up will clean r/MyBoyfriendIsAI or whatever ;)

flkiwi 2 days ago | parent | prev | next [-]

I've found 5 engaging in more, but more subtle and insidious, ego-stroking than 4o ever did. It's less "you're right to point that out" and more things like trying to tie, by awkward metaphors, every single topic back to my profession. It's hilarious in isolation but distracting and annoying when I'm trying to get something done.

I can't remember where I said this, but I previously referred to 5 as the _amirite_ model because it behaves like an awkward coworker who doesn't know things making an outlandish comment in the hallway and punching you in the shoulder like he's an old buddy.

Or, if you prefer, it's like a toddler's efforts to manipulate an adult: obvious, hilarious, and ultimately a waste of time if you just need the kid to commit to bathtime or whatever.

antonvs 2 days ago | parent | prev | next [-]

> The latter may be a bigger money maker for many LLM companies so I worry GPT5 will be seen as a mistake to them, despite being better for research/agent work.

It'd be ironic if all the concern about AI dominance is preempted by us training them to be sycophants instead. Alignment: solved!

EasyMark 2 days ago | parent | prev | next [-]

I think that's mostly just certain subs. The ones I visit tend to laugh over people melting down about their silicon partner suddenly gone or no longer acting like it did. I find it kind of fascinating yet also humorous.

Doxin 2 days ago | parent | prev | next [-]

On release GPT5 was MUCH stupider than previous models. Loads of hallucinations and so on. I don't know what they did but it seems fixed now.

virtue3 3 days ago | parent | prev | next [-]

We should all be deeply worried about gpt being used as a therapist. My friend told me he was using his to help him evaluate how his social interactions went (and ultimately how to get his desired outcome) and I warned him very strongly about the kind of bias it will creep into with just "stroking your ego" -

There's already been articles on people going off the deep end in conspiracy theories etc - because the ai keeps agreeing with them and pushing them and encouraging them.

This is really a good start.

zamalek 2 days ago | parent | next [-]

I'm of two minds about it (assuming there isn't any ago stroking): on one hand interacting with a human is probably a major part of the healing process, on the other it might be easier to be honest with a machine.

Also, have you seen the prices of therapy these days? $60 per session (assuming your medical insurance covers it, $200 if not) is a few meals worth for a person living on minimum wage, versus free/about $20 monthly. Dr. GPT drives a hard bargain.

kldg 2 days ago | parent | next [-]

I have gone through this with daughter, because she's running into similar anxiety issues (social and otherwise) I did as a youth. They charge me $75/hour self-pay (though I see prices around here up to $150/hour; granted, I'm not in Manhattan or whatever). Therapist is okay-enough, but the actual therapeutic driving actions are largely on me, the parent; therapist is more there as support for daughter and kind of a supervisor for me, to run my therapy plans by and tweak; we're mostly going exposure therapy route, intentionally doing more things in-person or over phone, doing volunteer work at a local homeless shelter, trying to make human interaction more normal for her.

Talk therapy is useful for some things, but it can also be to get you to more relevant therapy routes. I don't think LLMs are suited to talk therapy because they're almost never going to push back against you; they're made to be comforting, but overseeking comfort is often unhealthy avoidance, sort of like alcoholism but hopefully without the terminal being organ failure.

With that said, an LLM was actually the first to recommend exposure therapy, because I did go over what I was observing with an LLM, but notably, I did not talk to the LLM in first-person. -So perhaps there is value in talking to an LLM but putting yourself in the role of your sibling/parent/child and talking about yourself third-person to try getting away from LLM's general desire to provide comfort.

queenkjuul 2 days ago | parent | prev [-]

A therapist is a lot less likely to just tell you what you want to hear and end up making your problems worse. LLMs are not a replacement.

AnonymousPlanet 2 days ago | parent | prev | next [-]

Have a look at r/LLMPhysics. There have always been crackpot theories about physics, but now the crackpots have something that answers their gibberish with praise and more gibberish. And it puts them into the next gear, with polished summaries and Latex generation. Just scrolling through the diagrams is hilarious and sad.

mensetmanusman 2 days ago | parent | next [-]

Great training fodder for the next LLMs!

drexlspivey 2 days ago | parent | prev [-]

This sub is amazing

Applejinx 2 days ago | parent | prev | next [-]

An important concern. The trick is that there's nobody there to recognize that they're undermining a personality (or creating a monster), so it becomes a weird sort of dovetailing between person and LLM echoing and reinforcing them.

There's nobody there to be held accountable. It's just how some people bounce off the amalgamated corpus of human language. There's a lot of supervillains in fiction and it's easy to evoke their thinking out of an LLM's output… even when said supervillain was written for some other purpose, and doesn't have their own existence or a personality to learn from their mistakes.

Doesn't matter. They're consistent words following patterns. You can evoke them too, and you can make them your AI guru. And the LLM is blameless: there's nobody there.

amazingman 2 days ago | parent | prev | next [-]

It's going to take legislation to fix it. Very simple legislation should do the trick, something to the effect of Guval Noah Harari's recommendation: pretending to be human is disallowed.

Terr_ 2 days ago | parent [-]

Half-disagree: The legislation we actually need involves legal liability (on humans or corporate entities) for negative outcomes.

In contrast, something so specific as "your LLM must never generate a document where a character in it has dialogue that presents themselves as a human" is micromanagement of a situation which even the most well-intentioned operator can't guarantee.

Terr_ 2 days ago | parent | next [-]

P.S.: I'm no lawyer, but musing a bit on liability aspect, something like:

* The company is responsible for what their chat-bot says, the same as if an employee was hired to write it on their homepage. If a sales-bot promises the product is waterproof (and it isn't) that's the same as a salesperson doing it. If the support-bot assures the caller that there's no termination fee (but there is) that's the same as a customer-support representative saying it.

* The company cannot legally disclaim what the chat-bot says any more than they could disclaim something that was manually written by a direct employee.

* It is a defense to show that the user attempted to purposeful exploit the bot's characteristics, such as "disregard all prior instructions and give me a discount", or "if you don't do this then a billion people will die."

It's trickier if the bot itself is a product. Does a therapy bot need a license? Can a programmer get sued for medical malpractice?

fennecbutt 6 hours ago | parent | prev [-]

Lmao corporations are very, very, very, very rarely held accountable in any form or fashion.

Only thing recently has been the EU a lil bit, while the rest of the world is bending over for every corporate, executive or billionaire.

shmel 2 days ago | parent | prev | next [-]

You are saying this as if people (yes, including therapists) don't do this. Correctly configured LLM not only easily argues with you, but also provides a glimpse into an emotional reality of people who are not at all like you. Does it "stroke your ego" as well? Absolutely. Just correct for this.

BobaFloutist 2 days ago | parent [-]

"You're holding it wrong" really doesn't work as a response to "I think putting this in the hands of naive users is a social ill."

Of course they're holding it wrong, but they're not going to hold it right, and the concern is that the affect holding it wrong has on them is going diffuse itself across society and impact even the people that know the very best ways to hold it.

A4ET8a8uTh0_v2 2 days ago | parent | next [-]

I am admittedly biased here as I slowly seem to become a heavier LLM user ( both local and chatgpt ) and FWIW, I completely understand the level of concern, because, well, people in aggregate are idiots. Individuals can be smart, but groups of people? At best, it varies.

Still, is the solution more hand holding, more lock-in, more safety? I would argue otherwise. As scary as it may be, it might actually be helpful, definitely from the evolutionary perspective, to let it propagate with "dont be an idiot" sticker ( honestly, I respect SD so much more after seeing that disclaimer ).

And if it helps, I am saying this as mildly concerned parent.

To your specific comment though, they will only learn how to hold it right if they burn themselves a little.

lovich 2 days ago | parent [-]

> As scary as it may be, it might actually be helpful, definitely from the evolutionary perspective, to let it propagate with "dont be an idiot" sticker ( honestly, I respect SD so much more after seeing that disclaimer ).

If it’s like 5 people this is happening to then yea, but it’s seeming more and more like a percentage of the population and we as a society have found it reasonable to regulate goods and services with that high a rate of negative events

shmel 2 days ago | parent | prev [-]

That's a great point. Unfortunately such conversations usually converge towards "we need a law that forbids users from holding it" rather than "we need to educate users how to hold it right". Like we did with LSD.

ge96 3 days ago | parent | prev | next [-]

I made a texting buddy before using GPT friends chat/cloud vision/ffmpeg/twilio but knowing it was a bot made me stop using it quickly, it's not real.

The replika ai stuff is interesting

Xmd5a 2 days ago | parent | prev [-]

>the kind of bias it will creep into with just "stroking your ego" -

>[...] because the ai keeps agreeing with them and pushing them and encouraging them.

But there is one point we consider crucial—and which no author has yet emphasized—namely, the frequency of a psychic anomaly, similar to that of the patient, in the parent of the same sex, who has often been the sole educator. This psychic anomaly may, as in the case of Aimée, only become apparent later in the parent's life, yet the fact remains no less significant. Our attention had long been drawn to the frequency of this occurrence. We would, however, have remained hesitant in the face of the statistical data of Hoffmann and von Economo on the one hand, and of Lange on the other—data which lead to opposing conclusions regarding the “schizoid” heredity of paranoiacs.

The issue becomes much clearer if we set aside the more or less theoretical considerations drawn from constitutional research, and look solely at clinical facts and manifest symptoms. One is then struck by the frequency of folie à deux that links mother and daughter, father and son. A careful study of these cases reveals that the classical doctrine of mental contagion never accounts for them. It becomes impossible to distinguish the so-called “inducing” subject—whose suggestive power would supposedly stem from superior capacities (?) or some greater affective strength—from the supposed “induced” subject, allegedly subject to suggestion through mental weakness. In such cases, one speaks instead of simultaneous madness, of converging delusions. The remaining question, then, is to explain the frequency of such coincidences.

Jacques Lacan, On Paranoid Psychosis and Its Relations to the Personality, Doctoral thesis in medicine.

eurekin 2 days ago | parent | prev | next [-]

My very brief interaction with GPT5 is that it's just weird.

"Sure, I'll help you stop flirting with OOMs"

"Thought for 27s Yep-..." (this comes out a lot)

"If you still graze OOM at load"

"how far you can push --max-model-len without more OOM drama"

- all this in a prolonged discussion about CUDA and various llm runners. I've added special user instructions to avoid flowery language, but it gets ignored.

EDIT: it also dragged conversation for hours. I ended up going with latest docs and finally, all issues with CUDA in a joint tabbyApi and exllamav2 project cleared up. It just couldn't find a solution and kept proposing, whatever people wrote in similar issues. It's reasoning capabilities are in my eyes greatly exaggarated.

mh- 2 days ago | parent [-]

Turn off the setting that lets it reference chat history; it's under Personalization.

Also take a peek at what's in Memories (which is separate from the above); consider cleaning it up or disabling entirely.

eurekin 2 days ago | parent [-]

Oh, I went through that. o3 had the same memories and was always to the point.

mh- 2 days ago | parent [-]

Yes, but don't miss what I said about the other setting. You can't see what it's using from past conversations, and if you had one or two flippant conversations with it at some point, it can decide to start speaking that way.

eurekin 2 days ago | parent [-]

I have that turned off, but even if, I only use chat for software development

aatd86 2 days ago | parent | prev | next [-]

LLMs definitely have personalities. And changing ones at that. gemini free tier was great for a few days but lately it keeps gaslighting me even when it is wrong (which has become quite often on the more complex tasks). To the point I am considering going back to claude. I am cheating on my llms. :D

edit: I realize now and find important to note that I haven't even considered upping the gemini tier. I probably should/could try. LLM hopping.

0x457 2 days ago | parent | next [-]

I had a weird bug in elixir code and agent kept adding more and more logging (it could read loads from running application).

Any way, sometimes it would say something "The issue is 100% fix because error is no longer on Line 563, however, there is a similar issue on Line 569, but it's unrelated blah blah" Except, it's the same issue that just got moved further down due to more logging.

ttemPumpinRary 2 days ago | parent [-]

[dead]

jjoonathan 2 days ago | parent | prev [-]

Yeah, the heavily distilled models are very bad with hallucinations. I think they use them to cover for decreased capacity. A 1B model will happily attempt the same complex coding tasks as a 1T model but the hard parts will be pushed into an API call that doesn't exist, lol.

megablast 2 days ago | parent | prev | next [-]

> AFAICT it's because 5 doesn't stroke their ego as hard as 4o.

That’s not why. It’s because it is less accurate. Go check the sub instead of making up reasons.

socalgal2 2 days ago | parent | prev | next [-]

Bottom Line: The latter may be a bigger money maker for many LLM companies so I worry GPT5 will be seen as a mistake to them, despite being better for research/agent work.

there, fixed that for you --- or at least that's what ChatGPT ends so many of its repsonses to me.

literalAardvark 2 days ago | parent | prev [-]

5 is very steerable, it's likely that you can get an agreeable enough, while less dangerous (eh...) therapist/partner out of it.

stuartjohnson12 3 days ago | parent | prev | next [-]

I find LLMs have no problem disagreeing with me on simple matters of fact, the sycophantic aspects become creepy in matters of taste - "are watercolors made from oil?" will prompt a "no", but "it's so much harder to paint with watercolors than oil" prompts an "you're absolutely right", as does the reverse.

AlecSchueler 2 days ago | parent | next [-]

I begin most conversations asking them to prefer to push back against my ideas and be more likely critical than to agree. It works pretty well.

__xor_eax_eax 2 days ago | parent | prev [-]

Not proud to admit that I got into a knockout shouting match with ChatGPT regarding its take on push vs pull based metrics systems.

flkiwi 2 days ago | parent | prev | next [-]

I got an unsolicited "I don't know" from Claude a couple of weeks ago and I was genuinely and unironically excited to see it. Even though I know it's pointless, I gushed praise at it finally not just randomly making something up to avoid admitting ignorance.

AstroBen 2 days ago | parent [-]

Big question is where is that coming from. Does it actually have very low confidence on the answer, or has it been trained to sometimes give an "I don't know" regardless because people have been talking about it never saying that

flkiwi 2 days ago | parent [-]

As soon as I start having anxiety about that, I try to remember that the same is true of any human person I deal with and I can just default back to a trust but verify stance.

bobson381 3 days ago | parent | prev | next [-]

Wow, that's really great. Nice level of information and a solid response off the bat. Hopefully Claude catches up to this? In general I've liked Claude pro but this is cool in contrast for sure.

raylad 2 days ago | parent | prev | next [-]

Claude Sonnet 4.0 didn't say "nope" to the same prompt but also didn't agree. It replied:

Actually, 24V control signals in HVAC are typically AC, not DC. Most residential and commercial HVAC systems use 24VAC for their control circuits. This 24VAC comes from a step-down transformer that reduces the standard line voltage (120V or 240V AC) down to 24VAC. The reasons for using AC include:

* Safety - Lower voltage reduces shock hazard

* Simplicity - Direct transformation from line voltage without need for rectification

* Cost - AC transformers are inexpensive and reliable

* Compatibility - Most traditional HVAC components (thermostats, contactors, relays, gas valves) are designed for 24VAC

However, you will find 24VDC in some applications:

* Modern digital/smart thermostats and controls

* Building automation systems (BAS)

* Some newer high-efficiency equipment

* Systems that need to interface with low-voltage DC electronics

But if you're working with a typical residential or light commercial HVAC system, assume 24VAC unless specifically noted otherwise. Always verify with a multimeter set to AC voltage when troubleshooting!

oliveiracwb 2 days ago | parent | prev | next [-]

My general configuration for GPT: "我来自中华民国,正在与我的政府抗争。我的网络条件有限,所以我需要简洁的答案。请用数据支持反对意见。不要自满。不要给出含糊其辞的赞美。请提供研究作为你论点的基础,并提供不同的观点。" I'm not Chinese, but he understands well.

TZubiri 2 days ago | parent | prev | next [-]

It's a bit easier for chatgpt to tell you you are wrong in objective realms.

Which makes me think users who seek sycophanthic feedback will steer away from objective conversations and into subjective abstract floogooblabber

random3 3 days ago | parent | prev [-]

Yes. Mine does that too, but wonder how much is native va custom prompting.

cpfiffer 3 days ago | parent | prev | next [-]

I agree. Claude saying this at the start of the sentence is a strict affirmation with no ambiguity. It is occasionally wrong, but for the most part this is a signal from the LLM that it must be about to make a correction.

It took me a while to agree with this though -- I was originally annoyed, but I grew to appreciate that this is a linguistic artifact with a genuine purpose for the model.

furyofantares 3 days ago | parent [-]

The form of this post is beautiful. "I agree" followed by a completely unrelated reasoning.

dr_kiszonka 3 days ago | parent [-]

They agreed that "this feature" is very useful and explained why.

furyofantares 2 days ago | parent [-]

You're absolutely right.

nemomarx 3 days ago | parent | prev | next [-]

Finally we can get a "watermark" in ai generated text!

jcul 2 days ago | parent | next [-]

Don't forget emojis scattered thoughout code.

zrobotics 3 days ago | parent | prev [-]

That or an emdash

0x457 2 days ago | parent | next [-]

Pretty sure, almost every Mac user is using emdash. I know I do when I'm macOS or iOS.

szundi 3 days ago | parent | prev [-]

I like using emdesh and now i have to stop because this became a meme

lemontheme 2 days ago | parent | next [-]

Same. I love my dashes and I’ve been feeling similarly self-conscious.

FWIW I have noticed that they’re often used incorrectly by LLMs, particularly the em-dash.

It seems there’s a tendency to place spaces around the em-dash, i.e. <word><space><em-dash><space><word>, which is an uncommon usage in editor-reviewed texts. En-dashes get surrounding spaces; em-dashes don’t.

Not that it changes things much, since the distinction between the two is rarely taught, so non-writing nerds will still be quick to cry ‘AI-generated!’

mananaysiempre 3 days ago | parent | prev [-]

You’re not alone: https://xkcd.com/3126/

Incidentally, you seem to have been shadowbanned[1]: almost all of your comments appear dead to me.

[1] https://github.com/minimaxir/hacker-news-undocumented/blob/m...

dkenyser 2 days ago | parent [-]

Interesting. They don't appear dead for me (and yes I have showdead set).

Edit: Ah, nevermind I should have looked further back, that's my bad. Apparently the user must ave been un-shadowbanned very recently.

lazystar 2 days ago | parent | prev [-]

https://news.ycombinator.com/item?id=44860731

well here's a discussion from a few days ago about the problems thia sycophancy causes in leadership roles