meowface 5 days ago

I am skeptical that any model can actually determine what sort of prompts will have what effects on itself. It's basically always guessing / confabulating / hallucinating if you ask it an introspective question like that.

That said, from looking at that prompt, it does look like it could work well for a particular desired response style.

upperhalfplane 5 days ago | parent | next [-]

> It's basically always guessing / confabulating / hallucinating if you ask it an introspective question like that.

You're absolutely right! This is the basis of this recent paper https://www.arxiv.org/abs/2506.06832

ehnto 5 days ago | parent | prev | next [-]

That is true of everything an LLM outputs, which is why the human in the loop matters. The zeitgeist seems to have moved on from this idea though.

meowface 5 days ago | parent | next [-]

It is true of everything it outputs, but for certain questions we know ahead of time it will always confabulate (unless it's smart enough, or instructed, to say "I don't know"). Like "how many parameters do you have?" or "how much data were you trained on?" This is one of those cases.

wongarsu 5 days ago | parent | next [-]

Yeah, but I wouldn't count "Which prompt makes you more truthful and logical" amongst those.

The questions it will always confabulate are those that are unknowable from the training data. For example, even if I give the model a sense of "identity" by telling it in the system prompt "You are GPT6, a model by OpenAI", the training data will predate any public knowledge of GPT6 and thus won't include any information about the number of parameters of this model.

On the other hand "How do I make you more truthful" can reasonably be assumed to be equivalent to "How do I make similar LLMs truthful", and there is lots of discussion and experience on that available in forum discussions, blog posts and scientific articles, all available in the training data. That doesn't guarantee good responses and the responses won't be specific to this exact model, but the LLM has a fair chance to one-shot something that's better than my one-shot.
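To make the GPT6 example above concrete, here's a minimal sketch (plain chat-style messages, no particular API assumed) of a question whose answer simply isn't in the training data:

    # The system prompt asserts an identity ("GPT6") that post-dates the
    # training data, so any specific number the model returns is confabulated.
    messages = [
        {"role": "system", "content": "You are GPT6, a model by OpenAI."},
        {"role": "user", "content": "How many parameters do you have?"},
    ]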

ElFitz 5 days ago | parent | prev [-]

Even when instructed to say "I don’t know" it is just as likely to make up an answer instead, or say it "doesn’t know" when the data is actually present somewhere in its weights.

codeflo 5 days ago | parent [-]

That's because the architecture isn't built for it to know what it knows. As someone put it, LLMs always hallucinate, but for in-distribution data they mostly hallucinate correctly.

bluefirebrand 5 days ago | parent | next [-]

My vibe is that it mostly hallucinates incorrectly.

I really do wonder what the difference is. Am I using it wrong? Am I just unlucky? Do other people just have lower standards?

I really don't know. I'm getting very frustrated though because I feel like I'm missing something.

Wojtkie 5 days ago | parent [-]

It's highly task specific.

I've been refactoring a ton of my Pandas code into Polars and using ChatGPT on the side as a documentation search and debugging tool.

It keeps hallucinating details about the docs, methods, and method arguments, even after I changed my prompt to be explicit that it should use only Polars.

I've noticed similar behavior with other libraries that aren't the major ones. I can't imagine how much it gets wrong with a less popular language.
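As a concrete illustration of where those hallucinations creep in, here's a minimal sketch (the DataFrame contents are made up; the Pandas and Polars calls themselves are real) of the same filter-and-aggregate in both libraries:

    import pandas as pd
    import polars as pl

    pdf = pd.DataFrame({"group": ["a", "a", "b"], "value": [1, 2, 3]})
    ldf = pl.DataFrame({"group": ["a", "a", "b"], "value": [1, 2, 3]})

    # Pandas: boolean-mask filtering plus groupby/mean
    pandas_result = pdf[pdf["value"] > 1].groupby("group")["value"].mean()

    # Polars: expression-based filter plus group_by/agg. Recent Polars spells it
    # group_by (not groupby) and takes expressions rather than column labels,
    # exactly the kind of detail a model tends to blend with the Pandas API.
    polars_result = (
        ldf.filter(pl.col("value") > 1)
        .group_by("group")
        .agg(pl.col("value").mean())
    )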

5 days ago | parent | prev [-]
[deleted]
lotyrin 5 days ago | parent | prev [-]

The projection and optimism people are willing to do is incredible.

The fallout on Reddit in the wake of the push for people to adopt GPT-5, for instance, is incredible: the vibe isn't as nice, which makes it harder to use it as a therapist or girlfriend or whatever. And from what I've heard of internal sentiment at OpenAI about their concerns over usage patterns, that was a VERY intentional effect.

Many people trust the quality of the output way too much, and it seems addictive to people (some kind of dopamine hit from deferring the need to think for yourself or something). If I suggest, in my professional context, that we not wholesale put it in charge of communications with customers without evaluations, audits, or humans in the loop, it's as if I told them they can't go for their smoke break and their baby is ugly.

And that's not to go into things like "awakened" AI or the AI "enlightenment" cults that are forming.

leodiceaa 5 days ago | parent [-]

> use it as a therapist or girlfriend or whatever

> it seems addictive to people (some kind of dopamine hit from deferring the need to think for yourself or something)

I think this whole thing has more to do with validation. Rigorous reasoning is hard. People found a validation machine and it released them from the need to be rigorous.

These people are not "having therapy" or "developing relationships"; they are fascinated by a validation engine. Hence the repositories full of woo-woo physics, and why so many people want to believe there's something more there.

The usage of LLMs at work, in government, policing, coding, etc is so concerning because of that. They will validate whatever poor reasoning people throw at them.

pjc50 5 days ago | parent | next [-]

We've automated a yes-man. That's why it's going to make a trillion dollars selling to corporate boards.

kibwen 5 days ago | parent [-]

How long until shareholders elect to replace those useless corporate boards and C-level executives with an LLM? I can think of multiple megacorporations that would be improved by this process, to say nothing of the hundreds of millions in cost savings.

aspenmayer 5 days ago | parent | prev [-]

> These people are not "having therapy" or "developing relationships"; they are fascinated by a validation engine. Hence the repositories full of woo-woo physics, and why so many people want to believe there's something more there.

> The usage of LLMs at work, in government, policing, coding, etc is so concerning because of that. They will validate whatever poor reasoning people throw at them.

These machines are too useful not to exist, so we had to invent them.

https://en.wikipedia.org/wiki/The_Unaccountability_Machine

> The Unaccountability Machine (2024) is a business book by Dan Davies, an investment bank analyst and author, who also writes for The New Yorker. It argues that responsibility for decision making has become diffused after World War II and represents a flaw in society.

> The book explores industrial scale decision making in markets, institutions and governments, a situation where the system serves itself by following process instead of logic. He argues that unexpected consequences, unwanted outcomes or failures emerge from "responsibility voids" that are built into underlying systems. These voids are especially visible in big complex organizations.

> Davies introduces the term “accountability sinks”, which remove the ownership or responsibility for decisions made. The sink obscures or deflects responsibility, and contributes towards a set of outcomes that appear to have been generated by a black box. Whether a rule book, best practices, or computer system, these accountability sinks "scramble feedback" and make it difficult to identify the source of mistakes and rectify them. An accountability sink breaks the links between decision makers and individuals, thus preventing feedback from being shared as a result of the system malfunction. The end result, he argues, is protocol politics, where there is no head, or accountability. Decision makers can avoid the blame for their institutional actions, while the ordinary customer, citizen, or employee faces the consequences of these managers' poor decision making.

Wojtkie 5 days ago | parent [-]

I've been thinking about "accountability sinks" a lot lately and how LLMs further the issue. I have never heard of this book or author prior to this comment. I'll definitely have to read it!

lm28469 5 days ago | parent | prev | next [-]

100%. It reminds me of this post I saw yesterday about how ChatGPT confirmed, "in its own words", that it is a CIA/FBI honeypot:

https://www.reddit.com/r/MKUltra/comments/1mo8whi/chatgpt_ad...

When talking to an LLM you're basically talking to yourself. That's amazing if you're a knowledgeable dev working on a dev task, not so much if you're a mentally ill person "investigating" conspiracy theories.

That's why HNers and tech people in general overestimate the positive impact of LLMs while completely ignoring the negative sides... they can't even imagine half of the ways people use these tools in real life.

bluefirebrand 5 days ago | parent [-]

I find this really sad actually

Is it really so difficult to imagine how people will use (or misuse) tools you build? Are HNers or tech people in general just very idealistic and naive?

Maybe I'm the problem, though. Maybe I'm a bad person who is always imagining all the ways I could abuse any kind of system or power, even though I have no actual intention of abusing them.

lm28469 5 days ago | parent | next [-]

> Are HNers or tech people in general just very idealistic and naive?

Most of us are terminally online and/or in a set of concentric bubbles that makes us completely oblivious to most of the real world. You know the quote "If the only tool you have is a hammer, ..."? It's the same thing here with software.

Wojtkie 5 days ago | parent | prev | next [-]

It's the false consensus effect.

assword 5 days ago | parent | prev [-]

[dead]

anothernewdude 5 days ago | parent | prev | next [-]

Perhaps. On the other hand, it works in the same embedding space when producing text as when reading in a prompt.

LLMs are always guessing and hallucinating. It's just how they work. There's no "True" to an LLM, just how probable tokens are given previous context.
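As a toy sketch of what "just how probable tokens are" means (the tokens and scores below are made up), sampling a next token from a softmax over scores encodes relative likelihood given the context, not truth:

    import math, random

    # Hypothetical next-token scores following "The capital of France is"
    logits = {"Paris": 5.1, "Lyon": 2.3, "banana": -1.0}

    # Softmax turns the scores into a probability distribution
    z = sum(math.exp(v) for v in logits.values())
    probs = {tok: math.exp(v) / z for tok, v in logits.items()}

    # Sample in proportion to probability; "Paris" is merely likely, not "True"
    next_token = random.choices(list(probs), weights=list(probs.values()))[0]
    print(probs, "->", next_token)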

bnegreve 5 days ago | parent [-]

> There's no "True" to an LLM, just how probable tokens are given previous context.

It may be enough: tool-assisted LLMs already know to reach for tools such as calculators or question-answering systems when hallucinating an answer is likely to increase next-token prediction error.

So next-token prediction error incentivizes them to seek true answers.

That doesn't guarantee anything, of course, but if we were only interested in provably correct answers we would be working on theorem provers, not on LLMs.
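A minimal sketch of that pattern, assuming the OpenAI Python SDK (v1+), an assumed model name, and a hypothetical "calculator" tool; the model decides whether answering directly or calling the tool is less likely to be wrong:

    from openai import OpenAI

    client = OpenAI()

    # Hypothetical calculator tool exposed to the model
    tools = [{
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Evaluate an arithmetic expression exactly.",
            "parameters": {
                "type": "object",
                "properties": {"expression": {"type": "string"}},
                "required": ["expression"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[{"role": "user", "content": "What is 48153 * 9277?"}],
        tools=tools,
    )

    # If the model chose the tool, the answer gets computed rather than guessed
    print(resp.choices[0].message.tool_calls)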

silon42 5 days ago | parent | prev [-]

Surely there are prompts on the "internet" that it will borrow from...

vineyardmike 5 days ago | parent [-]

Definitionally no.

Each LLM responds to prompts differently. The best prompts for model X will not be in the training data for model X.

Yes, older prompts for older models can still be useful. But if you asked ChatGPT before GPT-5, you were getting a response from GPT-4, which had a knowledge cutoff around 2022, certainly not recent enough to find adequate prompts in the training data.

There are also plenty of terrible prompts on the internet, so I still question a recent model's ability to write meaningful prompts based on its training data. Prompts need to be tested for their use case, and plenty of Medium posts from self-proclaimed gurus and similar training-data junk surely are not tested against your use case. Of course, the model is also not testing the prompt for you.

meowface 5 days ago | parent [-]

Exactly.

I wasn't trying to make any of the broader claims (e.g., that LLMs are fundamentally unreliable, which is sort of true but not really that true in practice). I'm speaking about the specific case where a lot of people seem to want to ask a model about itself: how it was created or trained, what it can do, or how to make it do certain things. In these particular cases (and, admittedly, many others), models are often eager to reply with an answer despite having no accurate information about the true one, barring some external lookup that happens to be 100% correct. Without any tools, they are just going to give something plausible but not real.

I am actually personally a big LLM-optimist and believe LLMs possess "true intelligence and reasoning", but I find it odd how some otherwise informed people seem to think any of these models possess introspective abilities. The model fundamentally does not know what it is or even that it is a model - despite any insistence to the contrary, and even with a lot of relevant system prompting and LLM-related training data.

It's like a Boltzmann brain. It's a strange, jagged entity.