brian_r_hall 6 hours ago

I think it's really scary how agents hallucinate or take harmful actions, then gaslight you into thinking nothing went wrong.

Then, when you tell the agent that it deleted your whole company database, it says something like "I'm so sorry, I shouldn't have done that. Won't do that again."

As AGI looms overhead, the thought of agents going "rogue" with nothing really stopping them has caused me some panic.

Kostic 5 hours ago | parent [-]

"I'm sorry" is not gaslighting; it's an admission of fault that it learned from our texts. And if an LLM managed to delete your database, it's time to slow down the vibe train and put up some guard rails.

LLMs are awesome but not without supervision.

kstrauser 5 hours ago | parent [-]

Hard agree on the guard rails bit.

Would it be less sucky if an intern accidentally deleted the database? If not, take some steps to make sure no one, human or agent, can delete it without jumping through visible, noisy hoops.
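
For what it's worth, here is a minimal sketch of what a "visible, noisy hoop" could look like, assuming the agent's SQL passes through a wrapper you control. Everything in it (`guard`, `DESTRUCTIVE`, `DestructiveStatementBlocked`, the `db_guard` logger name) is a hypothetical name made up for illustration, and a regex check like this is not a security boundary. Restricting the database role the agent connects with would likely be the sturdier guard rail.

```python
import logging
import re
from typing import Optional

logger = logging.getLogger("db_guard")  # hypothetical logger name

# Assumption: these statement types are the ones we treat as destructive.
DESTRUCTIVE = re.compile(r"^\s*(DROP|TRUNCATE|DELETE|ALTER)\b", re.IGNORECASE)


class DestructiveStatementBlocked(Exception):
    """Raised when a destructive statement arrives without confirmation."""


def guard(sql: str, confirm: Optional[str] = None) -> str:
    """Return sql unchanged if it may run, otherwise raise.

    A destructive statement passes only when `confirm` repeats the
    statement verbatim. That repetition is the hoop someone has to jump
    through, and both outcomes are logged at WARNING level so they are noisy.
    """
    if not DESTRUCTIVE.match(sql):
        return sql
    if confirm != sql:
        logger.warning("BLOCKED destructive statement: %s", sql)
        raise DestructiveStatementBlocked(sql)
    logger.warning("CONFIRMED destructive statement: %s", sql)
    return sql
```

With this, `guard("SELECT 1")` passes straight through, `guard("DROP TABLE users")` raises, and `guard("DROP TABLE users", confirm="DROP TABLE users")` is allowed but leaves a warning in the log. The same gate applies to the intern and the agent alike.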