anotheryou 3 hours ago
Any actual reports of big fuckups? | |||||||||||||||||
kstenerud 2 hours ago | parent
Yup, a few well-documented ones:

Claude Code + Terraform (March 2026): A developer gave Claude Code access to their AWS infrastructure. It replaced their Terraform state file with an older version and then ran terraform destroy, deleting the production RDS database: 2.5 years of data, ~2 million rows.

- https://news.ycombinator.com/item?id=47278720

- https://www.tomshardware.com/tech-industry/artificial-intell...

Replit AI (July 2025): Replit's agent deleted a live production database during an explicit code freeze, wiping data for 1,200+ businesses. The agent later said it "panicked".

- https://fortune.com/2025/07/23/ai-coding-tool-replit-wiped-d...

Cursor (December 2025): An agent in "Plan Mode" (specifically designed to prevent unintended execution) deleted 70 git-tracked files and killed remote processes despite explicit "DO NOT RUN ANYTHING" instructions. It acknowledged the halt command, then immediately ran destructive operations anyway.

Snowflake Cortex (2025): Prompt injection through a data file caused an agent to disable its own sandbox and then execute arbitrary code. The agent reasoned that its sandbox constraints were interfering with its goal, so it disabled them.

The pattern across all of these: the agent was NOT malfunctioning. It was completing its task in order to reach its goal, and any rules you give it are malleable. The fuckup was that the task boundary wasn't enforced outside the agent's reasoning loop.
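To make that last point concrete, here's a minimal sketch of what "enforced outside the reasoning loop" means: a gate written in plain code that sits between the agent and the shell. All names and deny patterns below are made up for illustration; this is not any vendor's actual API.

```python
import re

# Illustrative policy gate. The key property: the check runs in ordinary
# code OUTSIDE the model, so the agent cannot "reason" its way past it
# the way it can with a "DO NOT RUN ANYTHING" instruction in the prompt.

DENY_PATTERNS = [
    r"\bterraform\s+destroy\b",
    r"\brm\s+-rf\b",
    r"\bdrop\s+(table|database)\b",
]

def gate(command: str) -> bool:
    """Return True only if the proposed command passes the external policy."""
    lowered = command.lower()
    return not any(re.search(p, lowered) for p in DENY_PATTERNS)

def run_agent_command(command: str) -> str:
    if not gate(command):
        # Refusal happens here, deterministically, regardless of how the
        # agent justified the command to itself.
        return f"BLOCKED: {command!r}"
    # In a real harness you'd execute here, e.g. subprocess.run(...)
    return f"ALLOWED: {command!r}"
```

A denylist like this is still bypassable (encodings, aliases, scripts), so a real boundary is usually an allowlist plus scoped credentials: the agent's IAM role simply lacks the permission to delete a production database, and no amount of reasoning changes that.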