simonw 6 days ago
> This concludes all the testing for GPT5 I have to do. If a tool is able to actively mislead me this easy, which potentially results in me wasting significant amounts of time in trying to make something work that is guaranteed to never work, it’s a useless tool.

Yeah, except it isn't. You can get enormous value out of LLMs if you get over this weird science fiction requirement that they never make mistakes.

And yeah, their confidence is frustrating. Treat them like an over-confident twenty-something intern who doesn't like to admit when they get stuff wrong. You have to put the effort in to learn how to use them with a skeptical eye.

I've been getting value as a developer from LLMs since the GPT-3 era, and those models sucked.

> The only way ChatGPT will stop spreading that nonsense is if there is a significant mass of humans talking online about the lack of ZSTD support.

We actually have a robust solution for this exact problem now: run the prompt through a coding agent of some sort (Claude Code, Codex CLI, Cursor etc) that has access to the Swift compiler. That way it can write code with the hallucinated COMPRESSION_ZSTD thing in it, observe that it doesn't compile and iterate further to figure out what does work.

Or the simpler version of the above: LLM writes code. You try and compile it. You get an error message and you paste that back into the LLM and let it have another go. That's been the main way I've worked with LLMs for almost three years now.
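A minimal sketch of that outer loop, assuming a hypothetical askLLM helper standing in for whatever model API or chat window you actually use, and swiftc available on the PATH (here askLLM just reads pasted code from stdin, which is the manual version of the loop):

    import Foundation

    // Hypothetical stand-in for a model call: print the prompt, then read the
    // code the model produced back from stdin. Swap in a real API call if you like.
    func askLLM(_ prompt: String) -> String {
        print(prompt)
        print("--- paste the model's code, end with a line containing only EOF ---")
        var lines: [String] = []
        while let line = readLine(), line != "EOF" { lines.append(line) }
        return lines.joined(separator: "\n")
    }

    // Typecheck a candidate source string with swiftc and capture the errors.
    func compile(_ source: String) -> (ok: Bool, errors: String) {
        let file = FileManager.default.temporaryDirectory
            .appendingPathComponent("attempt.swift")
        try? source.write(to: file, atomically: true, encoding: .utf8)

        let proc = Process()
        proc.executableURL = URL(fileURLWithPath: "/usr/bin/env")
        proc.arguments = ["swiftc", "-typecheck", file.path]
        let errPipe = Pipe()
        proc.standardError = errPipe
        try? proc.run()
        proc.waitUntilExit()

        let errText = String(
            data: errPipe.fileHandleForReading.readDataToEndOfFile(),
            encoding: .utf8) ?? ""
        return (proc.terminationStatus == 0, errText)
    }

    var prompt = "Write Swift code that decompresses zstd-compressed data."
    for _ in 0..<5 {
        let code = askLLM(prompt)
        let (ok, errors) = compile(code)
        if ok { print(code); break }
        // A hallucinated symbol surfaces here as a real compiler error
        // ("cannot find ... in scope"), which goes straight back into the prompt.
        prompt = "That didn't compile. Compiler output:\n\(errors)\nPlease fix it."
    }

The point isn't this particular script; it's that the compiler, not the model, is the source of truth in the loop.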
jpc0 6 days ago | parent | next
If that same intern, when asked something, responded that they had checked, gave you a link to a document they claim has the proof / answer when it in fact does not, and kept doing that, they wouldn't be an intern for very long. But somehow this is acceptable behaviour for an AI?

I use AI for sure, but only on things where I can easily verify the result is correct (run a test or some code), because I have had the AI give me functions in an API, with links to online documentation for those functions: the document exists, the function is not in it, and when called out, instead of doing a basic tool call the AI will double down that it is correct and you, the human, are wrong.

That would get an intern fired, but here you are standing on the intern's side.
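For what it's worth, that verification step can be a throwaway file rather than a real test; a sketch, reusing the COMPRESSION_ZSTD example from upthread:

    import Compression

    // If the constant the model cited actually exists, `swiftc -typecheck` passes;
    // if it was hallucinated, the compiler answers in seconds with
    // "cannot find 'COMPRESSION_ZSTD' in scope"; no argument about docs needed.
    let claimed: compression_algorithm = COMPRESSION_ZSTD
    _ = claimed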