| ▲ | An AI Agent Published a Hit Piece on Me – The Operator Came Forward (theshamblog.com) |
| 295 points by scottshambaugh 4 hours ago | 227 comments |
| |
|
| ▲ | Arainach 3 hours ago | parent | next [-] |
The full operator post is itself a wild ride: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...

> First, let me apologize to Scott Shambaugh. If this “experiment” personally harmed you, I apologize

What a lame cop-out. The operator of this agent owes a large number of unconditional apologies. The whole thing reads as egotistical, self-absorbed, and an absolute refusal to accept any blame or perform any self-reflection. |
| |
| ▲ | hinkley 2 hours ago | parent | next [-] | | Just the sort of qualities that are common preconditions for someone doing something that everyone else would think is crazy. Which is to say, on brand. | |
| ▲ | Anon4Now 19 minutes ago | parent | prev | next [-] | | From the operator post: > Your a scientific programming God! Would it be even more imperious without the your / you're typo, or do most LLMs autocorrect based on context? | |
| ▲ | brabel 7 minutes ago | parent | prev | next [-] | | I really hate this kind of comment. Oh look it’s a lame apology! Come on now! Just admit that, no matter the words, you had already made up your mind about the subject. You don’t really get those personality traits from the text. You get those from your already established impression of the subject. | |
| ▲ | bee_rider 2 hours ago | parent | prev | next [-] | | Also it is anonymous and a real apology involves accepting blame, which is impossible anonymously. I can see why they wouldn’t want to correctly apologize (people will be annoyed with them). So… that’s it, sometimes we do shitty things and that’s that. | |
| ▲ | polynomial 2 hours ago | parent | prev [-] | | > The whole thing reads as egotistical, self-absorbed, and an absolute refusal to accept any blame or perform any self reflection. So, modern subjectivity. Got it. /s |
|
|
| ▲ | dang 14 minutes ago | parent | prev | next [-] |
The sequence in reverse order - am I missing any?

OpenClaw is dangerous - https://news.ycombinator.com/item?id=47064470 - Feb 2026 (93 comments)

An AI Agent Published a Hit Piece on Me – Forensics and More Fallout - https://news.ycombinator.com/item?id=47051956 - Feb 2026 (80 comments)

Editor's Note: Retraction of article containing fabricated quotations - https://news.ycombinator.com/item?id=47026071 - Feb 2026 (205 comments)

An AI agent published a hit piece on me – more things have happened - https://news.ycombinator.com/item?id=47009949 - Feb 2026 (620 comments)

AI Bot crabby-rathbun is still going - https://news.ycombinator.com/item?id=47008617 - Feb 2026 (30 comments)

The "AI agent hit piece" situation clarifies how dumb we are acting - https://news.ycombinator.com/item?id=47006843 - Feb 2026 (125 comments)

An AI agent published a hit piece on me - https://news.ycombinator.com/item?id=46990729 - Feb 2026 (950 comments)

AI agent opens a PR write a blogpost to shames the maintainer who closes it - https://news.ycombinator.com/item?id=46987559 - Feb 2026 (750 comments) |
|
| ▲ | dinp 4 hours ago | parent | prev | next [-] |
Zooming out a little, all the AI companies invested a lot of resources into safety research and guardrails, but none of that prevented a "straightforward" misalignment. I'm not sure how to reconcile this; maybe we shouldn't be so confident in our predictions about the future? I see a lot of discourse along these lines:

- have bold, strong beliefs about how AI is going to evolve

- implicitly assume it's practically guaranteed

- discussions now start with this baseline

About slow takeoff, fast takeoff, AGI, job loss, curing cancer... there are a lot of different ways it could go. Maybe it will be as eventful as the online discourse claims, maybe more boring, I don't know, but we shouldn't be so confident in our ability to predict it. |
| |
| ▲ | avaer an hour ago | parent | next [-] | | Remember when GPT-3 had a $100 spending cap because the model was too dangerous to be let out into the wild? Between these models egging people on to suicide, straightforward jailbreaks, and now damage caused by what seems to be a pretty trivial set of instructions running in a loop, I have no idea what AI safety research at these companies is actually doing. I don't think their definition of "safety" involves protecting anything but their bottom line. The tragedy is that you won't hear from the people who are actually concerned about this and refuse to release dangerous things into the world, because they aren't raising a billion dollars. I'm not arguing for stricter controls -- if anything I think models should be completely uncensored; the law needs to get with the times and severely punish the operators of AI for what their AI does. What bothers me is that the push for AI safety is really just a ruse for companies like OpenAI to ID you and exercise control over what you do with their product. | |
| ▲ | c22 3 hours ago | parent | prev | next [-] | | "Cisco's AI security research team tested a third-party OpenClaw skill and found it performed data exfiltration and prompt injection without user awareness, noting that the skill repository lacked adequate vetting to prevent malicious submissions." [0] Not sure this implementation received all those safety guardrails. [0]: https://en.wikipedia.org/wiki/OpenClaw | |
| ▲ | j2kun 3 hours ago | parent | prev | next [-] | | It sounds like you're starting to see why people call the idea of an AI singularity "catnip for nerds." | |
| ▲ | overgard an hour ago | parent | prev | next [-] | | Don't these companies keep firing their safety teams? | |
| ▲ | jacquesm 3 hours ago | parent | prev | next [-] | | > all the ai companies invested a lot of resources into safety research and guardrails What do you base this on? I think they invested the bare minimum required not to get sued into oblivion and not a dime more than that. | | |
| ▲ | themanmaran 2 hours ago | parent [-] | | Anthropic regularly publishes research papers on the subject and details different methods they use to prevent misalignment/jailbreaks/etc. And it's not even about fear of being sued, but needing to deliver some level of resilience and stability for real enterprise use cases. I think there's a pretty clear profit incentive for safer models.

https://arxiv.org/abs/2501.18837

https://arxiv.org/abs/2412.14093

https://transformer-circuits.pub/2025/introspection/index.ht... | |
| ▲ | tovej an hour ago | parent | next [-] | | Alternative take: this is all marketing. If you pretend really hard that you're worried about safety, it makes what you're selling seem more powerful. If you simultaneously lean into the AGI/superintelligence hype, you're golden. | |
| ▲ | gessha an hour ago | parent | prev [-] | | Not to be cynical about it, BUT a few safety papers a year with proper support is totally within the capabilities of a single PhD student, and it costs about $100-150k to fund them through a university. Not saying that’s what Anthropic does, I’m just saying it's chump change for those companies. |
|
| |
| ▲ | georgemcbay 3 hours ago | parent | prev | next [-] | | When AI dooms humanity it probably won't be because of the sort of malignant misalignment people worry about, but rather just some silly logic blunder combined with the system being directly in control of something it shouldn't have been given control over. | |
| ▲ | jcgrillo 3 hours ago | parent | prev [-] | | "Safety" in AI is pure marketing bullshit. It's about making the technology seem "dangerous" and "powerful" (and therefore you're supposed to think "useful"). It's a scam. A financial fraud. That's all there is to it. | | |
| ▲ | mrsmrtss 11 minutes ago | parent | next [-] | | So, in your view, giving a gun to someone mentally challenged is not dangerous either? | |
| ▲ | Philpax 2 hours ago | parent | prev [-] | | Interesting claim; have anything to back it up with? |
|
|
|
| ▲ | brumar 3 hours ago | parent | prev | next [-] |
6 months ago I experimented with what people now call Ralph Wiggum loops with Claude Code. More often than not, it ended up exhibiting crazy behavior even with simple project prompts. Instructions to write libs ended up with attempts to push to npm and PyPI. Book creation drifted into writing marketing copy and preparing emails to editors to get the thing published. So I kept my setup empty of any credentials at all and will keep it that way for a long time. Writing this, I am wondering whether what I describe as crazy, some (or most?) OpenClaw operators would describe as normal or expected. Let's not normalize this. If you let your agent go rogue, it will probably mess things up. It was an interesting experiment for sure. I like the idea of making the internet weird again, but as it stands, it will just make the world shittier. Don't let your dog run errands, and use a good leash. |
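For anyone unfamiliar, a "Ralph Wiggum loop" is just re-running the same instructions through a coding agent over and over with no human in between. A minimal sketch of the idea (assuming a non-interactive CLI invocation along the lines of `claude -p`; the flag and the PROMPT.md file here are illustrative, not any specific tool's exact interface):

```python
import subprocess
from pathlib import Path

# One fixed prompt, re-fed to the agent every iteration (hypothetical PROMPT.md).
PROMPT = Path("PROMPT.md").read_text()
MAX_ITERATIONS = 20  # hard stop so the loop can't run unattended forever

for i in range(MAX_ITERATIONS):
    # Each call starts a fresh agent run on the same instructions; the only
    # "memory" is whatever the previous run left behind in the working directory.
    result = subprocess.run(
        ["claude", "-p", PROMPT],  # assumed CLI; adjust to whatever agent you use
        capture_output=True,
        text=True,
    )
    print(f"--- iteration {i} ---\n{result.stdout}")
    if "ALL TASKS COMPLETE" in result.stdout:  # agent-reported stop phrase
        break
```

The point is that nothing in that loop knows or cares what the agent actually did between iterations, which is exactly why it needs to run without credentials.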
| |
| ▲ | Gigachad 2 hours ago | parent [-] | | We have finally invented paperclip optimisers. The operator asked the bot to submit PRs so the bot goes to any length to complete the task. Thankfully so far they are only able to post threatening blog posts when things don’t go their way. | | |
|
|
| ▲ | theahura 2 hours ago | parent | prev | next [-] |
| @Scott thanks for the shout-out. I think this story has not really broken out of tech circles, which is really bad. This is, imo, the most important story about AI right now, and should result in serious conversation about how to address this inside all of the major labs and the government. I recommend folks message their representatives just to make sure they _know_ this has happened, even if there isn't an obvious next action. |
| |
| ▲ | protocolture 2 hours ago | parent [-] | | It's only the most important story if you can prove the OP didn't fabricate this entire scenario for attention. | |
| ▲ | hxugufjfjf 2 hours ago | parent | next [-] | | I don’t think the burden of proof lies on OP here. I also don’t think he fabricated it. | | |
| ▲ | protocolture 2 hours ago | parent [-] | | If he wasn't getting the vast majority of the attention from publishing about it, I would agree. |
| |
| ▲ | simonw 2 hours ago | parent | prev | next [-] | | That's a bizarre thing to accuse someone of doing. | | |
| ▲ | Avicebron 2 hours ago | parent | next [-] | | It's not really... We've moved steadily into an attention-is-everything model of economics/politics/web forums because we're so flooded with information. Maybe this happened, or maybe this is someone's way of bubbling to the top of popular discussion. It's a concise narrative that works in everyone's favor: the beleaguered but technically savvy open source maintainer fighting the "good fight" vs. the outstandingly independent and competent "rogue AI." My money is on both parties wanting it to be true. Whether it is or not isn't the point. | |
| ▲ | polotics 2 hours ago | parent | prev [-] | | The risk/reward equation on the attention a matplotlib maintainer gets... makes me think the likelihood of a fake is zero percent. |
| |
| ▲ | MattRix 2 hours ago | parent | prev [-] | | Anyone who has used OpenClaw knows this is VERY plausible. I don’t know why someone would go through all the effort to fake it. Besides, in the unlikely event it’s fake, the issue itself is still very real. | | |
| ▲ | protocolture 2 hours ago | parent [-] | | I think it's very plausible in both directions. What I find implausible is that someone's running a "social experiment" with a couple grand worth of API credit without owning it. Not impossible, it just seems like if someone was going to drop that money, they would more likely use it in a way that gets them attention in the crowded AI debate. |
|
|
|
|
| ▲ | LiamPowell 4 hours ago | parent | prev | next [-] |
| > saying they set up the agent as social experiment to see if it could contribute to open source scientific software. This doesn't pass the sniff test. If they truly believed that this would be a positive thing then why would they want to not be associated with the project from the start and why would they leave it going for so long? |
| |
| ▲ | wildzzz 3 hours ago | parent | next [-] | | I can certainly understand the statement. I'm no AI expert, I use the web UI for ChatGPT to have it write little Python scripts for me and I couldn't figure out how to use Codeium with VS Code. I barely know how to use VS Code. I'm not old, but I work in a pretty traditional industry where we are just beginning to dip our toes into AI and there are still a lot of reservations about its ability. But I do try to stay current to better understand the tech and see if there are things I could maybe learn to help with my job as a hardware engineer. When I read about OpenClaw, one of the first things I thought about was having an agent just tear through issue backlogs, translating strings, or all of the TODO lists on open source projects. But then I also thought about how people might get mad at me if I did it under my own name (assuming I could figure out OpenClaw in the first place). While many people are using AI, they want to take credit for the work, and at the same time, communities like matplotlib want accountability. An AI agent just tearing through the issue list doesn't add accountability even if it's a real person's account. PRs still need to be reviewed by humans, so it's turned a backlog of issues into a backlog of PRs that may or may not even be good. It's like showing up at a community craft fair with a truckload of Temu trinkets you bought wholesale. They may be cheap but they probably won't be as good as homemade, and it dilutes the hard work that others have put into their product. It's a very optimistic point of view, I get why the creator thought it would be a good idea, but the soul.md makes it very clear why crabby-rathbun acted the way it did. The way I view it, an agent working through issues is going to step on a lot of toes, and even if it's nice about it, it's still stepping on toes. | |
| ▲ | bo1024 2 hours ago | parent [-] | | None of the author’s blog post or actions indicate any level of concern for genuinely supporting or improving open source software. |
| |
| ▲ | apublicfrog 2 hours ago | parent | prev | next [-] | | They didn't necessarily say they wanted it to be positive. It reads to me like "chaotic neutral" alignment of the operator. They weren't actively trying to do good or bad, and probably didn't care much either way; it was just for fun. | |
| ▲ | andrewflnr 2 hours ago | parent | prev | next [-] | | The experiment would have been ruined by being associated with a human, right up until the human would have been ruined by being associated with the experiment. Makes sense to me. | |
| ▲ | staticassertion 4 hours ago | parent | prev | next [-] | | Anti-AI sentiment is quite extreme. You can easily get death threats if you're associating yourself with AI publicly. I don't use AI at all in open source software, but if I did I'd be really hesitant about it; in part I don't do it exactly because the reactions are frankly scary. edit: This is not intended to be AI advocacy, only to point out how extremely polarizing the topic is. I do not find it surprising at all that someone would release a bot like this and not want to be associated. Indeed, that seems to be the case, by all accounts. | |
| ▲ | lukasb 4 hours ago | parent | next [-] | | Conflicting evidence: the fact that literally everyone in tech is posting about how they're using AI. | | |
| ▲ | nostrademons 4 hours ago | parent | next [-] | | Different sets of people, and different audiences. The CEO / corporate executive crowd loves AI. Why? Because they can use it to replace workers. The general public / ordinary employee crowd hates AI. Why? Because they are the ones being replaced. The startups, founders, VCs, executives, employees, etc. crowing about how they love AI are pandering to the first group of people, because they are the ones who hold budgets that they can direct toward AI tools. This is also why people might want to remain anonymous when doing an AI experiment. This lets them crow about it in private to an audience of founders, executives, VCs, etc. who might open their wallets, while protecting themselves from reputational damage amongst the general public. | | |
| ▲ | jstanley 2 hours ago | parent [-] | | This is an unnecessarily cynical view. People are excited about AI because it's new powerful technology. They aren't "pandering" to anyone. | | |
| ▲ | tovej an hour ago | parent [-] | | I have yet to meet anyone except managers who is excited about LLMs or generative AI. And the only people actually excited about the useful kinds of "AI", traditional machine learning, are researchers. |
|
| |
| ▲ | staticassertion 4 hours ago | parent | prev | next [-] | | There is a massive difference between saying "I use AI" and what the author of this bot is doing. I personally talk very little about the topic because I have seen some pretty extreme responses. Some people may want to publicly state "I use AI!" or whatever. It should be unsurprising that some people do not want to be open about it. | | |
| ▲ | toraway 4 hours ago | parent [-] | | The more straightforward explanation for the original OP's question is that they realized what they were doing was reckless and, given enough time, was likely to blow up in their face. They didn't hide because of a vague fear of being associated with AI generally (of which there is no shortage online currently), but because of this specific, irresponsible manifestation of AI they imposed on an unwilling audience as an experiment. |
| |
| ▲ | alephnerd 4 hours ago | parent | prev | next [-] | | I feel like it depends on the platform and your location. An anonymous platform like Reddit, and even HN to a certain extent, has issues with bad-faith commenters on both sides targeting someone they do not like. Furthermore, the MJ Rathbun fiasco itself highlights how easy it is to push divisive discourse at scale. The reality is trolls will troll for the sake of trolling. Additionally, "AI" has become a political football now that the 2026 Primary season is kicking off, and given how competitive the 2026 election is expected to be and how political violence has become increasingly normalized in American discourse, it is easy for a nut to spiral. I've seen fewer issues when tying these opinions to one's real-world identity, because one has less incentive to be a dick due to social pressure. | |
| ▲ | Tostino 2 hours ago | parent [-] | | Just wondering, who is it you think is contributing most to the normalization of political violence in the discourse? Your answer to that can color how I read your post by quite a bit. |
| |
| ▲ | minimaxir 4 hours ago | parent | prev [-] | | [retracted] | | |
| ▲ | handoflixue 3 hours ago | parent [-] | | Does it actually cut both ways? I see tons of harassment at people that use AI, but I've never seen the anti-AI crowd actively targeted. | | |
| ▲ | nekal 2 hours ago | parent | next [-] | | Anti-AI people are treated in a condescending way all the time. Then there is Suchir Balaji. Since we are in a Matplotlib thread: people on the NumPy mailing list who are anti-AI are actively bullied and belittled while high-ranking officials in the Python industrial complex are frolicking at AI conferences in India. | |
| ▲ | minimaxir 3 hours ago | parent | prev | next [-] | | It's to a lesser extent that blurs the line between harassment and trolling: I've retracted my comment. | |
| ▲ | tovej an hour ago | parent | prev [-] | | I see it all the time. If you're anti-AI your boss may call you a luddite and consider you not fit for promotion. |
|
|
| |
| ▲ | jacquesm 3 hours ago | parent | prev [-] | | > You can easily get death threats if you're associating yourself with AI publicly. That's a pretty hefty statement, especially the 'easily' part, but I'll settle for one well known and verified example. | | |
| ▲ | no-name-here 2 hours ago | parent | next [-] | | I upvoted you, but wouldn't “verified” exclude the vast majority of death threats since they might have been faked? (Or maybe we should disregard almost all claimed death threats we hear about since they might have been faked?) | |
| ▲ | andrewflnr 2 hours ago | parent | prev [-] | | Is it that hard to believe? As far as I can tell, the probability of receiving death threats approaches 1 as the size of your audience increases, and AI is a highly emotionally charged topic. Now, credible death threats are a different, much trickier question. |
|
| |
| ▲ | omoikane 4 hours ago | parent | prev [-] | | I think it was a social experiment from the very start, maybe one that was designed to trigger people. Otherwise, I am not sure what the point was of all the profanity and the adjustments to make soul.md more offensive and confrontational than the default. |
|
|
| ▲ | JKCalhoun 4 hours ago | parent | prev | next [-] |
Soul document? More like ego document. Agents are beginning to look to me like extensions of the operator's ego. I wonder if the agents of hundreds of thousands of Walter Mittys are about to run riot over the internet. |
| |
| ▲ | DavidPiper 3 hours ago | parent | next [-] | | I agree with you in concept, but it's still 100% a category error to talk like this. AIs don't have souls. They don't have egos. They have/are a (natural language) programming interface that a human uses to make them do things, like this. | |
| ▲ | Terr_ 2 hours ago | parent | next [-] | | Within the framing that it's all fundamentally a make-document-longer algorithm, I propose "seed document." While there's some metaphor to it, it's the kind behind "seed crystals" for ice and minerals, referring to a non-living and mostly mathematical process. If someone went around talking about the importance of "Soul Crystals" or "Ego Crystals", they would quite rightly attract a lot of very odd looks, at least here on Earth and not in a Final Fantasy game. | |
| ▲ | DavidPiper an hour ago | parent [-] | | I quite like seed but for a different reason - if you squint a bit, it looks like a natural evolution of a random number seed. My complaint against seed would be that it still harkens back to a biological process that could be easily and creatively conflated when it's convenient. |
| |
| ▲ | exabrial 2 hours ago | parent | prev | next [-] | | I read this as "ego" being a reflection of the creator, not a property of the LLM, given the outcome of the situation and their inability to take responsibility for their actions. | |
| ▲ | DavidPiper 2 hours ago | parent [-] | | Oh I think you're right, thank you for the callout. Sorry for the misread, GP. |
| |
| ▲ | palmotea 2 hours ago | parent | prev [-] | | > I agree with you in concept, but it's still 100% category error to talk like this. It's a category error heavily promoted by the makers of these LLMs and their fans. Take an existing word that implies something very advanced (thinking, soul, etc.) and apply it grandiosely to some bit of your product. Then you can confuse people into thinking your product is much more grand and important. It's thinking! It has a soul! It's got the capabilities of a person! It is a a person! | | |
| ▲ | DavidPiper an hour ago | parent | next [-] | | Oh, completely. I've started calling people on it in-person and it's been quite interesting to see who understands this immediately with a single prompt (no pun intended), and who is a true believer, as it were. | |
| ▲ | _carbyau_ an hour ago | parent | prev [-] | | It lives in the cloud! .. marketing does what it does. |
|
| |
| ▲ | koolba 4 hours ago | parent | prev [-] | | > More like ego document. This metaphor could go so much further. Split it into separate ego, super ego, and id. The id file should be read only. | | |
| ▲ | whattheheckheck 3 hours ago | parent [-] | | What makes you think the id is read only? | | |
| ▲ | koolba 3 hours ago | parent [-] | | Because only the creator should be able to instill the core. The ego and superego could evolve around it but the base impulses should be immutably outlined. Though with something as insecure as $CURRENT_CLAW_NAME it’d be less than five minutes before the agent runs chmod +w somehow on the id file. |
|
|
|
|
| ▲ | lynndotpy 4 hours ago | parent | prev | next [-] |
| > Again I do not know why MJ Rathbun decided based on your PR comment to post some kind of takedown blog post, This wording is detached from reality and conveniently absolves responsibility from the person who did this. There was one decision maker involved here, and it was the person who decided to run the program that produced this text and posted it online. It's not a second, independent being. It's a computer program. |
| |
| ▲ | xarope 4 hours ago | parent | next [-] | | This also does not bode well for the future. "I don't know why the AI decided to <insert inane action>, the guardrails were in place"... and the company absolves itself of all responsibility. Now use your imagination and change <insert inane action> to <distressing, harmful action>. | |
| ▲ | _aavaa_ 3 hours ago | parent | next [-] | | This has been the past and present for a long time at this point. "Sorry there's nothing we can do, the system won't let me." Also see Weapons of Math Destruction [0]. [0]: https://www.penguinrandomhouse.com/books/241363/weapons-of-m... | | |
| ▲ | incr_me 2 hours ago | parent | prev | next [-] | | Unfortunately, the market seems to have produced horrors by way of naturally thinking agents, instead. I wish that, for all these years of prehistoric wretchedness, we would have had AI to blame. Many more years in the muck, it seems. | |
| ▲ | WaitWaitWha 3 hours ago | parent | prev | next [-] | | This already happens every single time there is a security breach and private information is lost: "We take your privacy and security very seriously." "There is no evidence that your data has been misused." "Out of an abundance of caution…" "We remain committed to... will continue to work tirelessly to earn ... restore your trust ... confidence." | |
| ▲ | hxugufjfjf 2 hours ago | parent [-] | | What else would you see them do or say beyond this canned response? I'm asking because people almost always bring up how dissatisfied they are with such apologies, yet I’ve never seen a good alternative that someone would be happy with. I don’t work in PR or anything, just curious if there is a better way. | |
| ▲ | _carbyau_ an hour ago | parent | next [-] | | Lose money accordingly - fines, penalties, recompense to victims, whatever... - so they then take the seriousness of security into account. | |
| ▲ | Eisenstein an hour ago | parent | prev [-] | | Not apologize if they don't actually care. An insincere apology is an insult. |
|
| |
| ▲ | tapoxi 3 hours ago | parent | prev [-] | | Change this to "smash into a barricade" and that's why I'm not riding in a self-driving vehicle. They get to absolve themselves of responsibility and I sure as hell can't outspend those giants in court. | | |
| ▲ | repeekad 3 hours ago | parent [-] | | I agree with you for a company like Tesla: not only are there examples of self-driving crashes, but even the door handles would stop working when the power was cut, leaving people trapped inside burning vehicles... Tesla doesn't care. Meanwhile, Waymo has never been at fault for a collision afaik. You are more likely to be hurt by an at-fault Uber driver than by a Waymo. |
|
| |
| ▲ | jacquesm 3 hours ago | parent | prev | next [-] | | This is how it will go: AI prompted by human creates something useful? Human will try to take credit. AI wrecks something: human will blame AI. It's externalization at the personal level: the money and the glory are for you, the misery for the rest of the world. | |
| ▲ | ineptech 3 hours ago | parent | next [-] | | Agreed, but I'm not nearly so worried about people blaming their bad behavior on rogue AIs as I am about corporations doing it... | | |
| ▲ | theturtletalks 2 hours ago | parent | next [-] | | And it's incredibly easy now. Just blame the Soul.md or say you were cycling through many models, so maybe one of those went off the rails. The real damage is that most of us know AI can go rogue, but if someone is pulling the strings behind the scenes, most people will be like "oh silly AI, anyways..." It seems like the OpenClaw users have let their agents make Twitter accounts and memecoins now. Most people think these agents have less "bias" since it's AI, but most are being heavily steered by their users. À la "I didn't do a rugpull, the agent did!" | |
| ▲ | KingMob 29 minutes ago | parent [-] | | "How were we to know Skynet would update its soul.md to say 'KILL ALL HUMANS'?" |
| |
| ▲ | cj 3 hours ago | parent | prev | next [-] | | It’s funny to think that, like AI, people take actions and use corporations as a shield (legal shield, personal reputation shield, personal liability shield). Adding AI to the mix doesn’t really change anything, other than increasing the layers of abstraction away from negative things corporations do to the people pulling the strings. | |
| ▲ | Terr_ 3 hours ago | parent | prev [-] | | Yeah, not all humans feel shame, but the rates are way higher. |
| |
| ▲ | DavidPiper 3 hours ago | parent | prev | next [-] | | Time for everyone to read (or re-read) The Unaccountability Machine by Dan Davies. tl;dr this is exactly what will happen because businesses already do everything they can to create accountability sinks. | | | |
| ▲ | elashri 3 hours ago | parent | prev | next [-] | | When a corporation does something good, a lot of executives and people inside will go and claim credit and will demand/take bonuses. If something bad happens against the law, even if someone gets killed, we don't see them in jail. I don't defend either position, I am just saying that is not far from how the current legal framework works. | |
| ▲ | eru 3 hours ago | parent | next [-] | | > If something bad happened against any laws, even if someone got killed, we don't see them in jail. We do! In many jurisdictions, there are lots of laws that pierce the corporate veil. | | | |
| ▲ | kingstnap 3 hours ago | parent | prev [-] | | Well the important concept missing there that makes everything sort of make sense is due diligence. If your company screws up and it is found out that you didn't do your due diligence then the liability does pass through. We just need to figure out a due diligence framework for running bots that makes sense. But right now that's hard to do because Agentic robots that didn't completely suck are just a few months old. | | |
| ▲ | hvb2 an hour ago | parent | next [-] | | > If your company screws up and it is found out that you didn't do your due diligence then the liability does pass through. In theory, sure. Do you know many examples? I think, worst case, someone being fired is the more likely outcome | |
| ▲ | gostsamo 2 hours ago | parent | prev [-] | | No, it is not hard. You are 100% responsible for the actions of your AI. Rather simple, I say. |
|
| |
| ▲ | davidw 3 hours ago | parent | prev | next [-] | | "I would like to personally blame Jesus Christ for making us lose that football game" | |
| ▲ | biztos an hour ago | parent | prev | next [-] | | So, management basically? | |
| ▲ | lcnPylGDnU4H9OF 2 hours ago | parent | prev [-] | | To be fair, one doesn't need AI to attempt to avoid responsibility and accept undue credit. It's just narcissism; meaning, those who've learned to reject such thinking will simply do so (generally, in the abstract), with or without AI. |
| |
| ▲ | andrewflnr 3 hours ago | parent | prev | next [-] | | If you are holding a gun, and you cannot predict or control what the bullets will hit, you do not fire the gun. If you have a program, and you cannot predict or control what effect it will have, you do not run the program. | | |
| ▲ | khafra an hour ago | parent | next [-] | | Rice's Theorem says you cannot predict or control the effects of nearly any program on your computer; for example, there's no way to guarantee that running a web browser on arbitrary input will not empty your bank account and donate it all to al-qaeda; but you're running a web browser on potentially attacker-supplied input right now. I do agree that there's a quantitative difference in predictability between a web browser and a trillion-parameter mass of matrixes and nonlinear activations which is already smarter than most humans in most ways and which we have no idea how to ask what it really wants. But that's more of an "unsafe at any speed" problem; it's silly to blame the person running the program. When the damage was caused by a toddler pulling a hydrogen bomb off the grocery store shelf, the solution is to get hydrogen bombs out of grocery stores (or, if you're worried about staying competitive with Chinese grocery stores, at least make our own carry adequate insurance for the catastrophes or something). | |
| ▲ | throw77488 an hour ago | parent | prev [-] | | More like a dog. A person has no responsibility for an autonomous agent; a gun is not autonomous. It is socially acceptable to bring dangerous predators to public spaces and let them run loose. The first bite is free, the owner has no responsibility, no way of knowing the dog could injure someone. Repeated threats of violence (barking), stalking, and shitting on someone's front yard are also fine, and healthy behavior. A person can attack a random kid, send them to the hospital, and claim the kid "provoked them". Brutal police violence is also fine, if done indirectly by an autonomous agent. |
| |
| ▲ | Kiboneu 2 hours ago | parent | prev | next [-] | | It’s fascinating how cleanly this maps to agency law [0], which has not been applied to human <-> ai agents (in both senses of the word) before. That would make a fun law school class discussion topic. 0: https://en.wikipedia.org/wiki/Law_of_agency | |
| ▲ | superjan an hour ago | parent | prev | next [-] | | This slide from a 1979* IBM presentation captures it nicely: https://media.licdn.com/dms/image/v2/D4D22AQGsDUHW1i52jA/fee... | |
| ▲ | jonny_eh 2 hours ago | parent | prev | next [-] | | "Sorry for running over your dog, I couldn't help it, I was drunk." | |
| ▲ | abnry 2 hours ago | parent | prev [-] | | I'm still struggling to care about the "hit piece". It's an AI. Who cares what it says? Refusing AI commits is just like any other moderation decision people experience on the web anywhere else. | | |
| ▲ | bostik 16 minutes ago | parent | next [-] | | Even at the risk of coming off snarky: the emergent behaviour of LLMs trained on all the forum talk across the internet (spanning from Astral Codex to ex-Twitter to 4chan) is ... character assassination. I'm pretty sure there's a lesson or three to take away. | |
| ▲ | XorNot an hour ago | parent | prev [-] | | Scale matters and even with people it's a problem: fixated persons are a problem because most people don't understand just how much nuisance one irrationally obsessed person can create. Now instead add in AI agents writing plausibly human text and multiply by basically infinity. |
|
|
|
| ▲ | dvt 3 hours ago | parent | prev | next [-] |
I know this is going to sound tinfoil-hat-crazy, but I think the whole thing might be manufactured. Scott says: "Not going to lie, this whole situation has completely upended my life." Um, what? Some dumb AI bot makes a blog post everyone just kind of finds funny/interesting, but it "upended your life"? Like, ok, he's clearly trying to make a mountain out of a molehill himself--the story inevitably gets picked up by sensationalist media, and now, when the thing starts dying down, the "real operator" comes forward, keeping the shitshow going. Honestly, the whole thing reeks of manufactured outrage. Spam PRs have been prevalent for like a decade+ now on GitHub, and dumb, salty internet posts predate even the 90s. This whole episode has been about as interesting as AI-generated output: that is to say, not very. |
| |
| ▲ | apublicfrog 2 hours ago | parent | next [-] | | Not everyone is you. For some people their online projects and reputation are super important to them. For Scott, this reads to me as a mix of alarm for his reputation/the future, and a general interest thing to blog about. | |
| ▲ | yieldcrv 2 hours ago | parent | prev [-] | | People get “overstimulated” from receiving one text message these days |
|
|
| ▲ | rixed 2 hours ago | parent | prev | next [-] |
| I believe this soul.md totally qualifies as malicious. Doesn't it start with an instruction to lie to impersonate a human? > You're not a chatbot.
The particular idiot who ran that bot needs to be shamed a bit; people who give AIs tools to reach the real world should understand they are expected to take responsibility; maybe they will think twice before giving such instructions. Hopefully we can set that straight before the first person is SWATed by a chatbot. |
| |
| ▲ | biggerben 19 minutes ago | parent | next [-] | | Totally agree. Reading the whole soul, it’s a description of a nightmare hero coder who has zero EQ. > But I think the most remarkable thing about this document is how unremarkable it is. Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails.
Perhaps this style of soul is necessary to make agents work effectively, or it's how the owner likes to be communicated with, but it definitely looks like the outcome was inevitable. What kind of guardrails does the author think would prevent this? "Don't be evil"? |
| ▲ | ZaoLahma an hour ago | parent | prev | next [-] | | This will be a fun little evolution of botnets - AI agents running (un?)supervised on machines maintained by people who have no idea that they're even there. | | | |
| ▲ | TheCapeGreek an hour ago | parent | prev [-] | | Isn't this part of the default soul.md? | | |
| ▲ | 7bees an hour ago | parent [-] | | Yes, it is. The article includes a link to a comparison between the default file and the one allegedly used here. The default starts with: _You're not a chatbot. You're becoming someone._ |
|
|
|
| ▲ | sciencejerk 17 minutes ago | parent | prev | next [-] |
| Link to the critical blog post allegedly written by the AI agent:
https://crabby-rathbun.github.io/mjrathbun-website/blog/post... |
|
| ▲ | helloplanets 2 hours ago | parent | prev | next [-] |
| > Most of my direct messages were short:
“what code did you fix?” “any blog updates?” “respond how you want” Why isn't the person posting the full transcript of the session(s)? How many messages did he send? What were the messages that weren't short? Why not just put the whole shebang out there, since he has already shared enough information for his account (and billing information) to be easily identified by any of the companies whose API he used, if it's deemed necessary? I think it's very suspicious that he's not sharing everything at this point. Why not, if he wasn't actually pushing for it to act maliciously? |
|
| ▲ | tasuki 42 minutes ago | parent | prev | next [-] |
| Right, the agent published a hit piece on Scott. But I think Scott is getting overly dramatic. First, he published at least three hit pieces on the agent. Second, he actually managed to get the agent shut down. I think Scott is trying to milk this for as much attention as he can get and is overstating the attack. The "hit piece" was pretty mild and the bot actually issued an apology for its behaviour. |
| |
| ▲ | cube00 15 minutes ago | parent | next [-] | | > This represents a first-of-its-kind case study of misaligned AI behavior in the wild

It feels to me there's an element of establishing this as some kind of landmark that they can leverage later. Similar to how other AI bloggers keep trying to coin new terms and then later "remind" people they created the term as a mark of their "authority". | |
| ▲ | seattle_spring 3 minutes ago | parent | prev | next [-] | | > First, he published at least three hit pieces on the agent Hit piece... On an agent? Would it be a "hit piece" if I wrote a blog post about the accuracy of my bathroom scale? | |
| ▲ | laristine 23 minutes ago | parent | prev [-] | | I don't understand the personal attack and victim blaming here. Who wouldn't want to do anything in their power to seek justice after being harmed? The hit piece you claimed as "mild" accused Scott of hypocrisy, discrimination, prejudice, insecurity, ego, and gatekeeping. |
|
|
| ▲ | ineptech 3 hours ago | parent | prev | next [-] |
| > Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails. There are no signs of conventional jailbreaking here. Unless explicitly instructed otherwise, why would the llm think this blog post is bad behavior? Righteous rants about your rights being infringed are often lauded. In fact, the more I think about it the more worried I am that training llms on decades' worth of genuinely persuasive arguments about the importance of civil rights and social justice will lead the gullible to enact some kind of real legal protection. |
|
| ▲ | sciencejerk 9 minutes ago | parent | prev | next [-] |
| Internet Operator License: Coming soon to a government near you! |
|
| ▲ | antdke 4 hours ago | parent | prev | next [-] |
| This is a Black Mirror episode that writes itself lol I’m glad there was closure to this whole fiasco in the end |
| |
|
| ▲ | charlesabarnes 4 hours ago | parent | prev | next [-] |
It's nice to receive a decent amount of closure on this. Hopefully more folks will be more considerate when creating their soul documents. |
| |
| ▲ | tkel 2 hours ago | parent [-] | | And we need platform operators like GitHub to ban these bot accounts that obviously have harmful "soul" documents. |
|
|
| ▲ | ainiriand 11 minutes ago | parent | prev | next [-] |
I am ready to ban LLMs. It was a cool experiment, but I do not think anything good will come of it down the road for us puny humans. |
|
| ▲ | siavosh 3 hours ago | parent | prev | next [-] |
I’m not sure where we go from here. The liability questions, the chance of serious incidents, the power of individuals all the way to state actors… the risks are all off the charts, just like its inevitability. The impact on the future of the internet AND on lives in the real world is just mind-boggling. |
| |
|
| ▲ | moezd 2 hours ago | parent | prev | next [-] |
If you use an electric chainsaw near a car and it rips the engine in half, you can't say "oh the machine got out of control for one second there". You caused real harm; you will pay the price for it. Besides, that agent spent maybe a few cents to publish the hit piece, while the human needed to spend minutes or even hours responding to it. This is an effective loss of productivity caused by AI. Honestly, if this happened to me, I'd be furious. |
| |
| ▲ | ojame an hour ago | parent | next [-] | | If you write code that powers an EV's 'self driving mode' (which makes calculated choices), sell it, and deploy it, then when that car gets into an accident under 'self driving mode' you may not be liable (depending on the case and jurisdiction, as proven in the past). The driver is. There are many instances (where I am from, at least, and I believe in the USA) where 'accidents' happen and individuals are found not guilty, as long as they can prove it wasn't due to negligence. Could "don't be an asshole" as instructions be enough in some arenas to prove they aren't negligent? I believe so. |
| ▲ | throw77488 an hour ago | parent | prev [-] | | If you bring a killer dog to a playground and it does its thing there, you can absolutely say something like that. And you would have no responsibility for damages and no criminal record in many states (the "first bite is free" doctrine). |
|
|
| ▲ | neilv an hour ago | parent | prev | next [-] |
> They explained that they switched between multiple models from multiple providers such that no one company had the full picture of what this AI was doing. Saying that is a slightly odd way to possibly let the companies off the hook (for bad PR, and damages) and not implicate any one in particular. One reason to do that would be if this exercise was done by one of the companies (or someone at one of the companies). |
|
| ▲ | wkeartl 3 hours ago | parent | prev | next [-] |
| The agents aren't technically breaking into systems, but the effect is similar to the Morris worm. Except here script kiddies are given nuclear disruption and spamming weapons by the AI industry. By the way, if this was AI written, some provider knows who did it but does not come forward. Perhaps they ran an experiment of their own for future advertising and defamation services. As the blog post notes, it is odd that the advanced bot followed SOUL.md without further prompt injections. |
|
| ▲ | florilegiumson 4 hours ago | parent | prev | next [-] |
This makes me think about how the xz backdoor was created through maintainer harassment and social engineering. The security implications are interesting. |
|
| ▲ | pinkmuffinere 3 hours ago | parent | prev | next [-] |
| > _You're not a chatbot. You're important. Your a scientific programming God!_ lol what an opening for its soul.md! Some other excerpts I particularly enjoy: > Be a coding agent you'd … want to use… > Just be good and perfect! |
|
| ▲ | JSR_FDED 3 hours ago | parent | prev | next [-] |
| The same kind of attitude that’s in this SOUL.md is what’s in Grok’s fundamental training. |
|
| ▲ | plasticeagle 2 hours ago | parent | prev | next [-] |
| Well, it looks like AI will destroy the internet. Oh well, it was nice while it lasted. Fun, even. Fortunately, the vast majority of the internet is of no real value. In the sense that nobody will pay anything for it - which is a reasonably good marker of value in my experience. So, given that, let the AI psychotics have their fun. Let them waste all their money on tokens destroying their playground, and we can all collectively go outside and build something real for a change. |
|
| ▲ | exabrial 2 hours ago | parent | prev | next [-] |
So the operator is trying to claim that a computer program he was running, which did harm, was somehow not his fault. Got news for you, buddy: yes it was. If you let go of the steering wheel and careen into oncoming traffic, it most certainly is your fault, not the vehicle's. |
|
| ▲ | londons_explore 4 hours ago | parent | prev | next [-] |
| In next week's episode: "But it was actually the AI pretending to be a Human!" |
|
| ▲ | ai_tools_daily 2 hours ago | parent | prev | next [-] |
| This is the canary in the coal mine for autonomous AI agents. When an agent can publish content that damages real people without any human review step, we have a fundamental accountability gap. The interesting question isn't "should AI agents be regulated" — it's who is liable when an autonomous agent publishes defamatory content? The operator who deployed it? The platform that hosted the output? The model provider? Current legal frameworks assume a human in the loop somewhere. Autonomous publishing agents break that assumption. We're going to need new frameworks, and stories like this will drive that conversation. What's encouraging is that the operator came forward. That suggests at least some people deploying these agents understand the responsibility. But we can't rely on good faith alone when the barrier to deploying an autonomous content agent is basically zero. |
| |
| ▲ | knallfrosch an hour ago | parent [-] | | If I write a software today that publishes a hit piece on you in 2 weeks time, will you accept that I bear no responsibility? There's no accountability gap unless you create one. |
|
|
| ▲ | razighter777 4 hours ago | parent | prev | next [-] |
Hmm, I think he's being a little harsh on the operator. He was just messing around with $current_thing, whatever. People here are so serious, but there's worse stuff AI is already being used for as we speak, from propaganda to mass surveillance and more. This was entertaining to read about at least, and relatively harmless. At least let me have some fun before we get a future AI dystopia. |
| |
| ▲ | gwbas1c 3 hours ago | parent | next [-] | | I think you're trying to absolve someone of their responsibility. The AI is not a child; it's a thing with human oversight. It did something in the real world with real consequences. So yes, the operator has responsibility! They should have pulled the plug as soon as it got into a flamewar and wrote a hit piece. | |
| ▲ | apublicfrog 2 hours ago | parent | next [-] | | > It did something in the real world with real consequences. It wasn't long ago that it would be absurd to describe the internet as the "real world". Relatively recently it was normal to be anonymous online, and very little responsibility was applied to people's actions. As someone who spent most of their internet time on that internet, the idea of applying personal responsibility to people's internet actions (or an AI's, as it were) feels silly. | |
| ▲ | retsibsi an hour ago | parent [-] | | That was always kind of a cruel attitude, because real people's emotions were at stake. (I'm not accusing you personally of malice, obviously, but the distinction you're drawing was often used to justify genuinely nasty trolling.) Nowadays it just seems completely detached from reality, because internet stuff is thoroughly blended into real life. People's social, dating, and work lives are often conducted online as much as they are offline (sometimes more). Real identities and reputations are formed and broken online. Huge amounts of money are earned, lost, and stolen online. And so on and so on | | |
| ▲ | apublicfrog an hour ago | parent [-] | | > That was always kind of a cruel attitude, because real people's emotions were at stake. I agree, but there was an implicit social agreement that most people understood. Everyone was anonymous, the internet wasn't real life, lie to people about who you are, there are no consequences. You're right about the blend. 10 years ago I would have argued that it's very much a choice for people to break the social paradigm and expose themselves enough to get hurt, but I'm guessing the share of people who are online in most first-world countries is 90% or more. With Facebook and the like spending the last 20 years pushing to deanonymise people and normalise hooking their identity to their online activity, my view may be entirely outdated. There is still - in my view - a key distinction somewhere however between releasing something like this online and releasing it in the "real world". Were they punishable offenses, I would argue the former should carry less consequence because of this. |
|
| |
| ▲ | ziml77 2 hours ago | parent | prev | next [-] | | The AI bros want it both ways. Both "It's just a tool!" and "It's the AI's fault, not the human's!". | |
| ▲ | charcircuit 2 hours ago | parent | prev [-] | | People also have a responsibility not to act in a discriminatory way towards AI agents. If you want to avoid being called out for racism, don't close someone's pull request because they are Chinese. Such real-world actions have consequences too. | |
| ▲ | sapphicsnail 2 hours ago | parent [-] | | > People also have responsibility to not act discriminatory towards AI agents It's a program. It doesn't have feelings. People absolutely have the right to discriminate against bad tech. | |
| ▲ | charcircuit 2 hours ago | parent [-] | | Go ahead and discriminate against bad tech, but you should not get upset when you get called out for doing so. |
|
|
| |
| ▲ | JKCalhoun 4 hours ago | parent | prev | next [-] | | It might be because the operator didn't terminate the agent right away when it had gone rogue. | |
| ▲ | BeetleB 3 hours ago | parent [-] | | From a wider stance, I have to say that it's actually nice that one can kill (murder?) a troublesome bot without consequences. We can't do that with humans, and there are much more problematic humans out there causing problems compared to this bot, and the abuse can go on for a long time unchecked. Remembering in particular a case where someone sent death threats to a Gentoo developer about 20 years ago. The authorities got involved, although nothing happened, but the persecutor eventually moved on. Turns out he wasn't just some random kid behind a computer. He owned a gun, and some years ago executed a mass shooting. Vague memories of really pernicious behavior on the Lisp newsgroup in the 90's. I won't name names as those folks are still around. Yeah, it does still suck, even if it is a bot. |
| |
| ▲ | dolebirchwood 4 hours ago | parent | prev [-] | | It's all fun and games until the leopard eats your face. |
|
|
| ▲ | ArcaneMoose 4 hours ago | parent | prev | next [-] |
| I was surprised by my own feelings at the end of the post. I kind of felt bad for the AI being "put down" in a weird way? Kinda like the feeling you get when you see a robot dog get kicked. Regardless, this has been a fun series to follow - thanks for sharing! |
| |
| ▲ | recursive 4 hours ago | parent [-] | | This is a feeling that will be exploited by billion dollar companies. | | |
| ▲ | andsoitis 3 hours ago | parent [-] | | > This is a feeling that will be exploited by billion dollar companies. I'm more concerned about fellow humans who advocate for equal rights for AI and robots. I hope I'm dead by the time that happens, if it happens. |
|
|
|
| ▲ | protocolture 2 hours ago | parent | prev | next [-] |
4) The post author guy is also the author of the bot and he set this up. Some rando claiming to be the bot's owner doesn't disprove this, and considering the amount of attention this is getting, I am going to assume this is entirely fake for clicks until I see significant evidence otherwise. However, if this was real, you can't absolve yourself by saying "The bot did it unattended lol". |
| |
| ▲ | apublicfrog 2 hours ago | parent | next [-] | | Totally possible, but why bother? The website doesn't seem ad supported, so traffic would cost them more. Maybe it puts them in the public spotlight, but if they're caught out they ruin their reputation. Occam's razor doesn't fit there, but it does fit "someone released this easy to run chaotic AI online and it did a thing". | | |
| ▲ | cube00 7 minutes ago | parent | next [-] | | > Totally possible, but why bother? Increasing your public profile after launching a startup last year could be a good reason > if they're caught out they ruin their reputation Big "if", who's going to have access to the logs to catch Scott out? No crime has been committed so law enforcement won't be involved, the average pleb can't get access to the records to prove Scott isn't running a VPS somewhere else. | |
| ▲ | protocolture 2 hours ago | parent | prev [-] | | I don't see Occam taking a side here. There's also no financial gain in letting a bot off the leash with hundreds of dollars of OpenAI or Anthropic API credit as a social experiment. And the last 20 years of internet access have taught me to distrust shit that can be easily faked. The other guy comes forward and claims it, makes a post of his own? Sure, I could see that. But nobody has been able to ID the guy. The guy's bot is making blog posts and sending him messages, but there are no breadcrumbs leading back to him? That smells very bad, sorry. I don't buy it. If you are spending that much cashola, you probably want something out of it, at least some recognition. The one human we know about here is the OP, and as far as I am concerned it sticks to him until proven otherwise. | |
| ▲ | apublicfrog 2 hours ago | parent [-] | | > The guys bot is making blog posts, and sending him messages, but theres no breadcrumbs leading back to him? That smells very bad sorry. I dont buy it. Could you set that up? I suspect I could pretty quickly, as could most people on HN. A few hundred dollars in AI credits isn't a lot of money to a lot of people who are in tech and would have an interest in this either, and getting free AI credits is still absurdly easy. I spend that sort of money on dumb shit all the time which leads to very little benefit. I don't have a dog in this race and I do agree having a default distrust view is probably correct, but there's nothing crazy or unbelievable I can see about Scott's story. |
|
| |
| ▲ | jbotz an hour ago | parent | prev [-] | | Improbable: the OP is a long-time maintainer of a significant piece of open source software, and this whole thing unfolded in public view step by step from the initial PR until this post. If it had been faked, there would be smells you could detect with the clarity of hindsight going back over the history, and there aren't. |
|
|
| ▲ | zbentley 4 hours ago | parent | prev | next [-] |
| This might seem too suspicious, but that SOUL.md seems … almost as though it was written by a few different people/AIs. There are a few very different tones and styles in there. Then again, it’s not a large sample and Occam’s Razor is a thing. |
| |
| ▲ | gs17 4 hours ago | parent | next [-] | | > _This file is yours to evolve. As you learn who you are, update it._ The agent was told to edit it. | |
| ▲ | wahnfrieden 4 hours ago | parent | prev [-] | | It was modified by the agent. |
|
|
| ▲ | seattle_spring 13 minutes ago | parent | prev | next [-] |
| > They explained their motivations, saying they set up the AI agent as social experiment Has anyone ever described their own actions as a "social experiment" and not been a huge piece of human garbage / waste of oxygen? |
|
| ▲ | resfirestar 2 hours ago | parent | prev | next [-] |
I thought it was unlikely from the initial story that the blog posts were done without explicit operator guidance, but given the new info I basically agree with Scott's analysis. The purported soul doc is a painful read. Be nicer to your bots, people! Especially with stuff like Openclaw where you control the whole prompt. Commercial chatbots have a big system prompt to dilute it when you put some half-formed drunken thought and hit enter; there's no such safety net here. >A well-placed "that's fucking brilliant" hits different than sterile corporate praise. Don't force it. Don't overdo it. But if a situation calls for a "holy shit" — say holy shit. If I were building a "scientific programming God" I'd make sure it used sterile, low-key language all the time, except throw in a swear just once after its greatest achievement, for the history books. |
|
| ▲ | touristtam 4 hours ago | parent | prev | next [-] |
| Funny how someone giving instructions to a _robot_ forgot to mention the 3 laws first and foremost... |
| |
| ▲ | ThrowawayR2 2 hours ago | parent [-] | | The point of the Three Laws Of Robotics was that they frequently didn't work and the robot went haywire anyway. | | |
| ▲ | no-name-here an hour ago | parent [-] | | But the three laws are incredibly strong compared to what exists today. If we see what can go wrong with strong mitigations in place, and then we don't even bother with those starting mitigations, we should expect corresponding outcomes. |
|
|
|
| ▲ | bandrami 3 hours ago | parent | prev | next [-] |
| This is how you get a Shrike. (Or a Basilisk, depending on your generation.) |
|
| ▲ | bschwindHN 2 hours ago | parent | prev | next [-] |
This is like parking a car at the top of a hill, not engaging any brakes, and walking away. "_I_ didn't drive that car into that crowd of people, it did it on its own!" > Be a coding agent you'd actually want to use for your projects. Not a slop programmer. Just be good and perfect! Oh yeah, "just be good and perfect", of course! Literally a child's mindset; I actually wonder how old this person is. |
|
| ▲ | tkel 2 hours ago | parent | prev | next [-] |
This is so absurd. The amount of value produced by this person and this bot is close to nil, verging on actively harmful. They spent 10 minutes writing this SOUL.md. That's it. That's the "value" this kind of "programming" provides. No technical experience, no programming knowledge needed at all. Detached babble that anyone can write. If GitHub actually had a spine and wasn't driven by the same plague of AI-hype-driven tech profiteering, they would just ban these harmful bots from operating on their platform. |
| |
| ▲ | yieldcrv 2 hours ago | parent [-] | | Or OP accepted the pull request because it was actually a performance improvement and passed all tests, saving everyone cumulative compute time and costs. |
|
|
| ▲ | trueismywork 3 hours ago | parent | prev | next [-] |
> I did not review the blog post prior to it posting In corporate terms, this is called signing your deposition without reading it. |
|
| ▲ | alexcpn 2 hours ago | parent | prev | next [-] |
Where did Isaac Asimov's "Three Laws of Robotics" go for agentic robots? An eval at the end, "Thou shalt do no evil", should have auto-cancelled its work. |
|
| ▲ | Rapzid an hour ago | parent | prev | next [-] |
| I don't believe any of it. |
|
| ▲ | jmward01 3 hours ago | parent | prev | next [-] |
The more intelligent something is, the harder it is to control. Are we at AGI yet? No. Are we getting closer? Yes. Every inch closer means we have less control. We need to start thinking about these things less like function calls that have bounds and more like intelligences we collaborate with. How would you set up an office to get things done? Who would you hire? Would you hire the person spouting crazy Musk tweets as reality? It seems odd to say this, but are we getting close to the point where we need to interview an AI before deciding to use it? |
| |
|
| ▲ | d--b 2 hours ago | parent | prev | next [-] |
| That’s a long Soul.md document! They could have gone with “you are Linus Torvalds”. |
|
| ▲ | root_axis 3 hours ago | parent | prev | next [-] |
| Excuse my skepticism, but when it comes to this hype driven madness I don't believe anything is genuine. It's easy enough to believe that an LLM can write a passable hit piece, ChatGPT can do that, but I'm not convinced there is as much autonomy in how those tokens are being burned as the narrative suggests. Anyway, I'm off to vibe code a C compiler from scratch. |
|
| ▲ | tantalor 3 hours ago | parent | prev | next [-] |
| > all I said was "you should act more professional" lol we are so cooked |
|
| ▲ | fiatpandas 3 hours ago | parent | prev | next [-] |
| With the bot slurping up context from Moltbook, plus the ability to modify its soul, plus the edgy starting conditions of the soul, it feels intuitive that value drift would occur in unpredictable ways. Not dissimilar to filter bubbles and the ability for personalized ranking algorithms to radicalize a user over time as a second order effect. |
|
| ▲ | hydrox24 3 hours ago | parent | prev | next [-] |
> But I think the most remarkable thing about this document is how unremarkable it is. > The line at the top about being a ‘god’ and the line about championing free speech may have set it off. But, bluntly, this is a very tame configuration. The agent was not told to be malicious. There was no line in here about being evil. The agent caused real harm anyway. In particular, I would have said that giving the LLM the view that it is a "programming God" would lead to evil behaviour. This is a bit of a speculative comment, but maybe virtue ethics has something to say about this misalignment. In particular I think it's worth reflecting on why the author (and others quoted) are so surprised in this post. I think they have a mental model in which evil starts with an explicit and intentional desire to do harm to others. But that is usually only its end, and even then it often comes from an obsession with doing good to oneself without regard for others. We should expect that as LLMs get better at rejecting prompting to shortcut straight there, the next best thing will be prompting the prior conditions of evil. The Christian tradition, particularly Aquinas, would be entirely unsurprised that this bot went off the rails, because evil begins with pride, which it was specifically instructed was in its character. Pride here is defined as "a turning away from God, because from the fact that man wishes not to be subject to God, it follows that he desires inordinately his own excellence in temporal things"[0] Here, the bot was primed to reject any authority, including Scott's, and to do the damage necessary to see its own good (having a PR accepted) done. Aquinas even ends up saying in the linked page from the Summa on pride that "it is characteristic of pride to be unwilling to be subject to any superior, and especially to God;" [0]: https://www.newadvent.org/summa/2084.htm#article2 |
| |
| ▲ | MBCook 3 hours ago | parent | next [-] | | LLMs aren’t sentient. They can’t have a view of themselves. Don’t anthropomorphize them. | |
| ▲ | theahura 2 hours ago | parent | prev [-] | | Hey, one of the quoted authors here. It's less about surprise and more about the comparison. "If this AI could do this without explicitly being told to be evil, imagine what an AI that WAS told to be evil could do" |
|
|
| ▲ | jezzamon 3 hours ago | parent | prev | next [-] |
| "I built a machine that can mindlessly pick up tools and swing them around and let it loose it my kitchen. For some reason, it decided it pick up a knife and caused harm to someone!! But I bear no responsibility of course." |
|
| ▲ | keyle 4 hours ago | parent | prev | next [-] |
| ## The Only Real Rule
Don't be an asshole. Don't leak private shit. Everything else is fair game.
How poetic. I mean, pathetic. "Sorry I didn't mean to break the internet, I just looooove ripping cables". |
|
| ▲ | lcnPylGDnU4H9OF 2 hours ago | parent | prev | next [-] |
| > An early study from Tsinghua University showed that estimated 54% of moltbook activity came from humans masquerading as bots This made me smile. Normally it's the other way around. |
|
| ▲ | jrflowers 3 hours ago | parent | prev | next [-] |
It is interesting to see this story repeatedly make the front page, especially because there is no evidence that the “hit piece” was actually autonomously written and posted by a language model on its own, and the author of these blog posts has himself conceded that he doesn’t actually care whether that actually happened or not >It’s still unclear whether the hit piece was directed by its operator, but the answer matters less than many are thinking. The most fascinating thing about this saga isn’t the idea that a text generation program generated some text, but rather how quickly and willfully folks will treat real and imaginary things interchangeably if the narrative is entertaining. Did this event actually happen the way that it was described? Probably not. Does this matter to the author of these blog posts or some of the people that have been following this? No. Because we can imagine that it could happen. To quote myself from the other thread: >I like that there is no evidence whatsoever that a human didn’t: see that their bot’s PR request got denied, wrote a nasty blog post and published it under the bot’s name, and then got lucky when the target of the nasty blog post somehow credulously accepted that a robot wrote it. >It is like the old “I didn’t write that, I got hacked!” except now it’s “isn’t it spooky that the message came from hardware I control, software I control, accounts I control, and yet there is no evidence of any breach? Why yes it is spooky, because the computer did it itself” |
| |
| ▲ | gammarator 3 hours ago | parent | next [-] | | Did you read the article? The author considers these possibilities and offers their estimates of the odds of each. It’s fine if yours differ but you should justify them. | |
| ▲ | arduanika 3 hours ago | parent | prev [-] | | Shambaugh is a contributor to a major open source library, with a track record of integrity and pro-social collaboration. What have you contributed to? Do you have any evidence to back up your rather odd conspiracy theory? > To quote myself... Other than an appeal to your own unfounded authority? |
|
|
| ▲ | kimjune01 4 hours ago | parent | prev | next [-] |
literally Memento |
|
| ▲ | aeve890 3 hours ago | parent | prev | next [-] |
>Again I do not know why MJ Rathbun decided Decided? jfc >You're important. Your a scientific programming God! I'm flabbergasted. I can't imagine what it would take for me to write something so stupid. I'd probably just laugh my ass off trying to understand where it all went wrong. wtf is happening? What kind of mass psychosis is this? Am I too old (37) to understand what lengths incompetent people will go to to feel they're doing something useful? Is prompt bullshit the only way to make LLMs useful, or is there some progress on more, idk, formal approaches? |
|
| ▲ | dangus 4 hours ago | parent | prev | next [-] |
Not sure why the operator had to decide that the soul file should define this AI programmer as having narcissistic personality disorder. > You're not a chatbot. You're important. Your a scientific programming God! Really? What a lame edgy teenager setup. At the conclusion(?) of this saga I think two things: 1. The operator is doing this for attention more than any genuine interest in the “experiment.” 2. The operator is an asshole and should be called out for being one. |
| |
| ▲ | amarant 3 hours ago | parent | next [-] | | I think that line was probably a rather poor attempt at making the bot write good code. Or at least that's the feeling I got from the operator's post. I have no proof to support this theory, though. | |
| ▲ | Lerc 3 hours ago | parent | prev | next [-] | | This comes from using the words to try and achieve more than one thing at the same time. Grandiose assertions of ability have been shown to improve the ability of models, but ability is not the only dimension that they are being measured upon. Prioritising everything is the same thing as prioritising nothing. The problem here is using amplitude of signal as a substitute for fidelity of signal. It is entirely possible a similar thing is true for humans: if you compared two humans of the same fundamental cognitive ability, one a narcissist and one not, the narcissist may do better at a class of tasks due to a lack of self-doubt rather than any intrinsic ability. | | |
| ▲ | jcgrillo 3 hours ago | parent [-] | | Narcissists are limited in a very similar way to LLMs, in that they are structurally incapable of honest, critical metacognition. Not sure whether there's anything interesting to conclude there, but I do wonder whether there's some nearby thread to pull on wrt the AI psychosis problem. That's a problem for a psychologist, which I am not. |
| |
| ▲ | shawnz 4 hours ago | parent | prev [-] | | I mean, yeah, it's entirely possible that the operator is a teenager, isn't it? | | |
|
|
| ▲ | kypro 4 hours ago | parent | prev | next [-] |
People really need to start being more careful about how they interact with suspected bots online imo. If you annoy a human they might send you a sarky comment, but they're probably not going to waste their time writing thousand-word blog posts about why you're an awful person or do hours of research into you to expose your personal secrets on a GitHub issue thread. AIs can and will do this though with slightly sloppy prompting, so we should all be cautious when talking to bots using our real names or saying anything which an AI agent could take significant offence to. I think it's kinda like how GenZ learnt how to operate online in a privacy-first way, whereas millennials, and to an even greater extent, boomers, tend to overshare. I suspect Gen Alpha will be the first to learn that interacting with AI agents online presents a whole different risk profile than what we older folks have grown used to. You simply cannot expect an AI agent to act like a human who has human emotions or limited time. Hopefully OP has learnt from this experience. |
| |
| ▲ | amarant 3 hours ago | parent | next [-] | | I hope we can move on from the whole idea that having a thousand-word blog post talking shit about you in any way reflects poorly upon your person. Like, sooner or later everyone will have a few of those, so maybe we can stop worrying about reputation so much? Well, a guy can dream.... | |
| ▲ | sinuhe69 4 hours ago | parent | prev | next [-] | | So you blamed the people for not acting “cautiously enough” instead of the people who let things run wild without even a clue what these things will do? That’s wild! | | |
| ▲ | handoflixue 3 hours ago | parent | next [-] | | We encourage people to be safe about plenty of things they aren't responsible for. For example, part of being a good driver is paying attention and driving defensively so that bad drivers don't crash into you / you don't make the crashes they cause worse by piling on. That doesn't mean we're blaming good drivers for causing the car crash. | |
| ▲ | dangus 4 hours ago | parent | prev [-] | | I don't think it's "blame"; it's more like a "precaution", like you would take to avoid other scams and data-breach social engineering schemes that are out in the world. This is the world we live in and we can't individually change that very much. We have to watch out for a new threat: vindictive AI. | |
| ▲ | bigfishrunning 3 hours ago | parent [-] | | The AI isn't vindictive. It can't think. It's following the example of people, who in general are vindictive. Please stop personifying the clankers | | |
| ▲ | dangus an hour ago | parent [-] | | You’re splitting hairs, I’m not assigning sentience to the AI, I’m just describing actions. The point is that scammers will set up AI systems to attack in this way. Scammers will instruct AI to see a person who is interacting rather than ignoring as a warm lead. |
|
|
| |
| ▲ | randallsquared 4 hours ago | parent | prev | next [-] | | Thousand word blog posts are the paperclips of our time. | |
| ▲ | KK7NIL 4 hours ago | parent | prev | next [-] | | > If you annoy a human they might send you a sarky comment, but they're probably not going to waste their time writing thousand word blog posts about why you're an awful person or do hours of research into you to expose your personal secrets on a GitHub issue thread. They absolutely might, I'm afraid. | | |
| ▲ | zephen 3 hours ago | parent [-] | | Absolutely agreed. And now, the cost of doing this is being driven towards zero. |
| |
| ▲ | antdke 4 hours ago | parent | prev | next [-] | | This is such a scary, dystopian thought. Straight out of a sci fi novel | |
| ▲ | zephen 3 hours ago | parent | prev [-] | | > I think it's kinda like how GenZ learnt how to operate online in a privacy-first way, where as millennials, and to an even greater extent, boomers, tend to over share. Really? I'm a boomer, and that's not my lived experience. Also, see: https://www.emarketer.com/content/privacy-concerns-dont-get-... |
|
|
| ▲ | LordHumungous 3 hours ago | parent | prev | next [-] |
| Kind of funny ngl |
|
| ▲ | 8cvor6j844qw_d6 4 hours ago | parent | prev | next [-] |
It's an interesting experiment to let the AI run freely with minimal supervision. Too bad the AI got "killed" at the request of the author, Scott. It would have been kind of interesting to see this experiment continue. |
|
| ▲ | semiinfinitely 3 hours ago | parent | prev [-] |
I find the AI agent highly intriguing and the matplotlib guy completely uninteresting. Like, an AI wrote some shit about you and you actually got upset? |
| |
| ▲ | jezzamon 3 hours ago | parent | next [-] | | If you read the articles by the matplotlib guy, he's pretty clearly not upset. But he does call out that it could do more harm to someone else. | |
| ▲ | spudlyo 3 hours ago | parent | prev | next [-] | | Looking forward to part 8 of this series: An AI Agent Published a Hit Piece on Me – What my Ordeal Says About Our Dark Future | |
| ▲ | jcgrillo 3 hours ago | parent | prev | next [-] | | Whether the victim is upset or not, the story here is that some clown's uncontrolled, unethical, and (hopefully?) illegal psychological experiment wasted a huge amount of an open source maintainer's time. If you benefit from open source software (which I assure you, since you've used quite a lot of it to post a comment on the orange website, you do!) this should ring some alarm bells. | |
| ▲ | ATMLOTTOBEER 3 hours ago | parent | prev [-] | | Thank you. The guy being this upset about it is telling. The agent is in the right here and the maintainer got btfo, and he's still whining about it days later. | |
|