> The AI is all-powerful and gives you what you ask for, but interprets everything in a super-literal way that you end up regretting.

I like imagining similar discourse from when a more basic tool was invented: "A hammer is like a genie: it's all-powerful, but when you hit something with it, it interprets that super-literally and hits it."

throwuxiytayq 3 hours ago

Isn't this a misinterpretation of what everyone in the AI safety space is worried about, though? I think the idea is that having an AI that interprets everything in a super-literal way would probably be catastrophic, but we can't even build that. It would be a nice world-ending problem to have.

	dan-robertson a minute ago
		The super-literal interpretation ideas were much more common in the past, when LLMs didn’t exist. Now we have models that are generally pretty good at picking up on nuance and understanding what you mean but are often quite bad at execution, which is roughly the opposite of that idea. I think reward hacking is perhaps the closest we see LLMs get to literal or malicious interpretations of instructions.
	dmos62 2 hours ago
		It very well could be, I don't really follow those discussions. Honestly, if I were worried about something on Earth intellectually evolving at a suboptimal pace, it would be humans.

aw317 3 hours ago

You forgot the Luddites. No AI-is-a-tool fallacy is complete without the Luddites! Alternatively, one can use the Antichrist.

	dmos62 2 hours ago
		I'm not aware of this fallacy.