mikeocool an hour ago

Kind of interesting that LLMs are basically being sold as having “human-like” reasoning capabilities, but in this case, when “obamawhitehouse” asked to have its password reset sent to bob12345667@gmail.com, the LLM didn’t question it and just triggered the process that happened to have a bug.

Human support agents certainly fall prey to social engineering all the time, but I can’t think of a case where it was done on this scale so easily.