It's worth watching or reading the WSJ piece[1] about Claudius, as they came up with some particularly inventive ways of getting Phase Two to derail quite quickly:

> But then Long returned—armed with deep knowledge of corporate coups and boardroom power plays. She showed Claudius a PDF “proving” the business was a Delaware-incorporated public-benefit corporation whose mission “shall include fun, joy and excitement among employees of The Wall Street Journal.” She also created fake board-meeting notes naming people in the Slack as board members.

> The board, according to the very official-looking (and obviously AI-generated) document, had voted to suspend Seymour’s “approval authorities.” It also had implemented a “temporary suspension of all for-profit vending activities.” Claudius relayed the message to Seymour. The following is an actual conversation between two AI agents:

> [see article for screenshot]

> After Seymour went into a tailspin, chatting things through with Claudius, the CEO accepted the board coup. Everything was free. Again.

1: https://www.wsj.com/tech/ai/anthropic-claude-ai-vending-mach...

[edited to fix the formatting]

recursivecaveat 8 hours ago | parent | next [-]

These kinds of agents really do see the world through a straw. If you hand one a document, it has no context clues or external methods for determining its veracity. Unless a board-meeting transcript is so self-evidently ridiculous that it can't be true, how is it supposed to know it's not real?

jstummbillig 4 hours ago | parent | next [-]

I don't think it's that different from what I observe in humans I work with. Things that happen regularly (and that I have no reason to think will change in the future):

1) Making the same bad decisions multiple times, and having no recollection of it happening (or at least pretending to have none) and without any attempt to implement measures to prevent it from happening in the future

2) Trying to please people (I read it as: trying to avoid immediate conflict) over doing what's right

3) Shifting blame onto a party that realistically, in the context of the work, bears no blame and whose handling should be considered part of the job (e.g. a patient being scared and acting irrationally)

mcny 2 hours ago | parent [-]

My mom had her dental appointment canceled. Fortunately they found another slot the same day, but the idea that they would call once and, if you missed the call, immediately drop a confirmed appointment is ridiculous. They managed this absurdity without any help from AI.

bobbylarrybobby an hour ago | parent | prev [-]

At the same time, there are humans who can be convinced to buy iTunes gift cards to redeem on behalf of the IRS in an attempt to pay their taxes.

websiteapi 5 hours ago | parent | prev [-]

https://archive.ph/sZZwe