Yep. Last night I asked ChatGPT (4o) to help me generate a simple HTML canvas that users could draw on. Multiple times, it confidently presented a solution that didn't even kind of work (text copied from the chat below):

- "Final FIXED & WORKING drawing.html" (it wasn't working at all)

- "Full, Clean, Working Version (save as drawing.html)" (not working at all)

- "Tested and works perfectly with: Chrome / Safari / Firefox" (not working at all)

- "Working Drawing Canvas (Vanilla HTML/JS — Save this as index.html)" (not working at all)

- "It Just Works™" (not working at all)

The last one was so obnoxious I moved over to Claude (3.5 Sonnet) and it knocked it out in 3-5 prompts.

numpad0 3 months ago

IME, it's better to just delete erroneous responses and fix prompts until it works.

They are much better at fractally subdividing and interpreting inputs, like a believer of a religion, than at deconstructing and iteratively improving things, like an engineer. It's a waste of tokens trying to have such discussions with an LLM.

Aeolun 3 months ago

4o is almost laughably bad at code compared to Claude.

dullcrisp 3 months ago

To be fair, I wouldn't really expect working software if someone described it that way either.

rglover 3 months ago

Those are not my prompts. Those were the headings it put above the code it generated in its responses.

Even if my prompt was low-quality, it doesn't matter. It's confidently stating that what it produced was both tested and working. I personally understand that's not true, but of all the safeguards they should be putting in place, not lying should be near the top of the list.

mattgreenrocks 3 months ago

Intellectual humility is just as rare with AI as it is with humans.