simonw 6 days ago
> This concludes all the testing for GPT5 I have to do. If a tool is able to actively mislead me this easy, which potentially results in me wasting significant amounts of time in trying to make something work that is guaranteed to never work, it’s a useless tool.

Yeah, except it isn't. You can get enormous value out of LLMs if you get over this weird science fiction requirement that they never make mistakes.

And yeah, their confidence is frustrating. Treat them like an over-confident twenty-something intern who doesn't like to admit when they get stuff wrong. You have to put the effort in to learn how to use them with a skeptical eye.

I've been getting value as a developer from LLMs since the GPT-3 era, and those models sucked.

> The only way ChatGPT will stop spreading that nonsense is if there is a significant mass of humans talking online about the lack of ZSTD support.

We actually have a robust solution for this exact problem now: run the prompt through a coding agent of some sort (Claude Code, Codex CLI, Cursor etc) that has access to the Swift compiler. That way it can write code with the hallucinated COMPRESSION_ZSTD thing in it, observe that it doesn't compile and iterate further to figure out what does work.

Or the simpler version of the above: LLM writes code. You try and compile it. You get an error message and you paste that back into the LLM and let it have another go. That's been the main way I've worked with LLMs for almost three years now.
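A minimal sketch of that outer loop, assuming a hypothetical askLLM helper standing in for whatever model API or chat window you actually use, and swiftc available on the PATH (here askLLM just reads pasted code from stdin, which is the manual version of the loop):

    import Foundation

    // Hypothetical stand-in for a model call: print the prompt, then read the
    // code the model produced back from stdin. Swap in a real API call if you like.
    func askLLM(_ prompt: String) -> String {
        print(prompt)
        print("--- paste the model's code, end with a line containing only EOF ---")
        var lines: [String] = []
        while let line = readLine(), line != "EOF" { lines.append(line) }
        return lines.joined(separator: "\n")
    }

    // Typecheck a candidate source string with swiftc and capture the errors.
    func compile(_ source: String) -> (ok: Bool, errors: String) {
        let file = FileManager.default.temporaryDirectory
            .appendingPathComponent("attempt.swift")
        try? source.write(to: file, atomically: true, encoding: .utf8)

        let proc = Process()
        proc.executableURL = URL(fileURLWithPath: "/usr/bin/env")
        proc.arguments = ["swiftc", "-typecheck", file.path]
        let errPipe = Pipe()
        proc.standardError = errPipe
        try? proc.run()
        proc.waitUntilExit()

        let errText = String(
            data: errPipe.fileHandleForReading.readDataToEndOfFile(),
            encoding: .utf8) ?? ""
        return (proc.terminationStatus == 0, errText)
    }

    var prompt = "Write Swift code that decompresses zstd-compressed data."
    for _ in 0..<5 {
        let code = askLLM(prompt)
        let (ok, errors) = compile(code)
        if ok { print(code); break }
        // A hallucinated symbol surfaces here as a real compiler error
        // ("cannot find ... in scope"), which goes straight back into the prompt.
        prompt = "That didn't compile. Compiler output:\n\(errors)\nPlease fix it."
    }

The point isn't this particular script; it's that the compiler, not the model, is the source of truth in the loop.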
jpc0 6 days ago | parent | next
If that same intern, when asked something, responded that they had checked, gave you a link to a document they claim has the proof / answer when it in fact does not, and kept doing that, they wouldn't be an intern for very long. But somehow this is acceptable behaviour for an AI?

I use AI for sure, but only on things where I can easily verify the result is correct (run a test or some code), because I have had the AI give me functions in an API, with links to online documentation for those functions: the document exists, the function is not in it, and when called out, instead of doing a basic tool call the AI will double down that it is correct and you, the human, are wrong.

That would get an intern fired, but here you are standing on the intern's side.
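For what it's worth, that verification step can be a throwaway file rather than a real test; a sketch, reusing the COMPRESSION_ZSTD example from upthread:

    import Compression

    // If the constant the model cited actually exists, `swiftc -typecheck` passes;
    // if it was hallucinated, the compiler answers in seconds with
    // "cannot find 'COMPRESSION_ZSTD' in scope"; no argument about docs needed.
    let claimed: compression_algorithm = COMPRESSION_ZSTD
    _ = claimed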