Remix.run Logo
ben_w 2 days ago

> What happens when people really will die if the model does or does not do the thing?

Then someone didn't do their job right.

Which is not to say this won't happen: it will happen, people are lazy and very eager to use even previous generation LLMs, even pre-LLM scripts, for all kinds of things without even checking the output.

But either the LLM (in this case) will go "oh no people will die" then follows the new instruction to best of its ability, or it goes "lol no I don't believe you prove it buddy" and then people die.

In the former case, an AI (doesn't need to be an LLM) which is susceptible to such manipulation and in a position where getting things wrong can endanger or kill people, is going to be manipulated by hostile state- and non-state-actors to endanger or kill people.

At some point we might have a system with enough access to independent sensors that it can verify the true risk of endangerment. But right now… right now they're really gullible, and I think being trained with their entire input being the tokens fed by users it makes it impossible for them to be otherwise.

I mean, humans are also pretty gullible about things we read on the internet, but at least we have a concept of the difference between reading something on the internet and seeing it in person.