Anthropic Drops Flagship Safety Pledge

> “We felt that it wouldn't actually help anyone for us to stop training AI models,”

How magnanimous! They are only thinking of others, you see. They are rejecting their safety pledge for you.

> “We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

Oops, said the quiet part out loud that it’s all about money. “I mean, if all of our competitors are kicking puppies in the face, it doesn’t make sense for us to not do it too. Maybe we’ll also kick kittens while we’re at it”.

For all of you who thought Anthropic were “the good guys”, I hope this serves as a wake up call that they were always all the same. None of them care about you, they only care about winning.

▲

isodev an hour ago | parent | next [-]

Indeed, Anthropic can’t afford to be the ones that impose any kind of sense in the market - that’s supposed to be the job of the government by creating policy, regulations and installing watchdogs to monitor things.

But lucky for the AI companies, most of them are based in a place that only has a government on paper and everyone forgot where that paper is.

▲

nickserv an hour ago | parent [-]

The government is why they are dropping their pledge.

https://apnews.com/article/anthropic-hegseth-ai-pentagon-mil...

▲

isodev 37 minutes ago | parent [-]

That's because their government is asking for things that shouldn't be asked - again, no regulation, no oversight.

▲

nickserv 4 minutes ago | parent [-]

The government is forcing them to change their policy; by definition that is regulation and oversight. Let's say that the government was forcing a company to change their overall right-to-repair or return policy in order to avoid being on a blacklist, would that not be seen as oversight and regulation? Whether the regulation is legitimate or of benefit is a different argument.

▲

nsbk 8 minutes ago | parent | prev | next [-]

Since it is all about money, I just voted with my wallet and cancelled the Max subscription.

▲

watwut 2 hours ago | parent | prev | next [-]

> Oops, said the quiet part out loud that it’s all about money. “I mean, if all of our competitors are kicking puppies in the face, it doesn’t make sense for us to not do it too. Maybe we’ll also kick kittens while we’re at it”.

I mean, yes, that is actually how the world works. That is why we need safety, environmental, anti-fraud and other regulations. Because without them, competition makes it so that every successful company will defraud, hurt and harm. Those who won't will be taken over by those who do.

▲

rco8786 an hour ago | parent | next [-]

Yes, this. It's unfortunate that Anthropic dropped this, and it's also exactly how the system is supposed to work. Companies don't regulate themselves, the government regulates the companies.

Now, you may notice that the government is also choosing not to regulate these companies...which is another matter altogether.

▲

ozmodiar 42 minutes ago | parent [-]

It's so much worse than that. The government actively encourages a lack of business ethics. Heck, it started the term with a crypto rug pull. Money continues to funnel upward to all the worst players, and watchdogs are being targeted and destroyed. Even if you get new people in power, you're going to find the upper echelons completely full of outlandishly wealthy, morally bankrupt individuals that are very politically active. And now they have access to all of our communications and an AI to sift through it looking for dissent (or to spark its own). I guess this is the end game of "move fast and break things." The situation was never good, but it continues to get worse at an alarming rate.

▲

mschuster91 6 minutes ago | parent [-]

> Heck, it started the term with a crypto rug pull

If you ask me... that wasn't a rug pull, at least not in intent - it was more a way for foreign actors to funnel money directly to Trump and his family without any trace.

▲

latexr 22 minutes ago | parent | prev [-]

> I mean, yes, that is actually how the world works.

And soon enough, it won’t work at all because of it.

> Those who won't will be taken over by those who do.

And if you compromise on your core values because of money, they weren’t core values to begin with¹. “I want to be ethical but if I am I won’t get to be a billionaire” isn’t an excuse. We shouldn’t just shrug our shoulders at what we see as wrong because “everybody does it” or “that’s just business” or “that’s life”. Complacency and apologists are how a bad system remains bad.

https://www.newyorker.com/cartoon/a16995

¹ I’m willing to give some leeway to individuals. You can believe stealing is wrong but if you’re desperate and steal a loaf of bread to feed your kid, there’s nuance. A VC-backed company is something entirely different.

▲

surgical_fire 28 minutes ago | parent | prev | next [-]

> For all of you who thought Anthropic were “the good guys”

Was anyone fooled by this?

I mean, I know this is HN and there is a demographic here that gets all misty eyed about the benevolence of corporations.

It takes a special kind of naivety to believe in those claims.

▲

high_na_euv 3 hours ago | parent | prev | next [-]

But what is AI safety, really?

Censorship?

▲

davidguetta 2 hours ago | parent | prev [-]

Still waiting for an explicit answer on how 'safety' is truly distinguishable from 'censorship' or 'political correctness'.

Of course telling someone to go kill himself is a pretty sure 'no-no', but so many things are up to interpretation.

I VERY LARGELY prefer an AI like grok that doesn't pretend and leaves the onus of interpretation to the user, rather than a bunch of anonymous "researchers" who may be equally biased and, at the extreme, may tell you that America's founding fathers were black women.

▲

wattsy2025 2 hours ago | parent | next [-]

The most important part of AI safety is AI alignment: making sure AI does what we want. It's very hard because even if AI isn't trying to deceive you it can have bad outcomes by executing your request to the letter. The classical example is tasking an AI to make paperclips, training the AI with a reward for making more paperclips. Then the AI makes the most paperclips possible by strip mining the Earth and killing anything in its way.

Sometimes you see this AI alignment problem in action. I once asked an older model to fix the tests and it eventually gave up and just deleted them

▲

gehwartzen 2 hours ago | parent | prev | next [-]

Well, we teach kids not to yell “Fire!” in a crowded theatre or “N***!” at their neighbor. We also teach our industrial machines to distinguish between fingers and bolts, our cars to not say “make a left turn now” when on a bridge, etc.

▲

rudhdb773b 2 hours ago | parent [-]

The critical point is who the "we" is.

Is "we" the parents teaching their children their own unique values, or is the "we" a government or corporation forcing one set of values on all children.

Why not encourage the users of AI to use a Safety.md (populated with some reasonable but optional defaults)?

▲

dminik an hour ago | parent [-]

There's nothing a meaningless document can do when the AI is not aligned in the first place.

▲

SlinkyOnStairs an hour ago | parent | prev [-]

> I VERY LARGELY prefer an AI like grok that doesn't pretend and leaves the onus of interpretation to the user, rather than a bunch of anonymous "researchers" who may be equally biased and, at the extreme, may tell you that America's founding fathers were black women.

Setting aside for a moment that Grok is manipulated and biased to a hilarious extent. ("Elon is world champion at everything, including drinking piss")

There is no such thing as "unbiased". There will always be bias in these systems, whether picked up from the training data, or the choices made by the AI's developers/researchers, even if the latter doesn't "intend" to add any bias.

Ignoring this problem doesn't magically create a bias-free AI that "speaks the truth about the founding fathers". The bias in the training data, the implicit unconscious bias in the design decisions, that didn't come out of thin air. It's just somebody else's bias.

All the existing texts on the founding fathers are filled with 250 years of bias, propaganda, and agenda pushing from all sorts of authors.

There is no way to have no bias, no propaganda, no "agenda pushing" in the AI. The only thing that can be done is to acknowledge this problem, and try to steer the system to a neutral position. That will be "agenda pushing" of one's own, but that's the reality of all history and all historians since Herodotus. You just have to be honest about it.

And you will observe that current AI companies are excessively lazy about this. They do not put in the work, but instead slap on a prompt begging the system to "pls be diverse" and try to call it a day. This does not work.

> Of course telling someone to go kill himself is a pretty sure 'no-no', but so many things are up to interpretation.

Bear in mind that the context of Anthropic's pivot here is the Pentagon's dollars.

This isn't just about "anti-woke AI", it's about killbots.

Sure, Hegseth wants his robots to not do thoughtcrime about, say, trans people or the role of women in the military.

But above all he wants to do a lot of murder.

Anthropic dropping their position of "We shouldn't turn this technology we can barely control into murder machines" because they're running out of money is damnable.

▲

heftykoo 10 hours ago | parent | prev | next [-]

Ah, the classic AI startup lifecycle:

We must build a moat to save humanity from AI.

Please regulate our open-source competitors for safety.

Actually, safety doesn't scale well for our Q3 revenue targets.

▲

baq 7 hours ago | parent | next [-]

Foundational model provider manifesto:

‘While there’s value in safety, we value the Pentagon’s dollars more’

▲

pera 5 hours ago | parent [-]

It turns out the biggest threat to AI safety is capitalism, who would have thought

▲

samplatt 4 hours ago | parent | next [-]

Certainly not the prior century-and-a-half's worth of books and films.

▲

peyton 4 hours ago | parent | prev | next [-]

I don’t get it. Even the Soviet Union used money. Simply paying for stuff isn’t necessarily capitalism? Or are you suggesting Anthropic should be state-owned?

▲

jon-wood 4 hours ago | parent | next [-]

No, capitalism is prioritising profit over all other priorities, as we see happening here.

▲

wongarsu 3 hours ago | parent | prev [-]

Using money as a medium to facilitate exchange of goods and services is not capitalism. Abandoning one of your core principles in the pursuit of money, or more charitably because not doing so means your competitors will make more money and overtake you in the marketplace, is an outgrowth of capitalism.

In the Soviet Union the reasons might have been "to beat the Capitalists", "for the pride of our country" or "Stalin asked us to and saying no means we get sent to Siberia". Though a variant of the last one may well have happened here, and the justification we read is just the one less damaging to everyone involved.

▲

gibsonsmog 2 hours ago | parent [-]

> Though a variant of the last one may well have happened here, and the justification we read is just the one less damaging to everyone involved

Hegseth was planning on getting the model via the Defense Production Act or killing Anthropic via supply chain risk classification, preventing any other company working with the Pentagon from working with Anthropic. So while it wasn't Siberia, it was about as close as the US can get without declaring Claude a terrorist. Which I'm sure is on the table regardless.

▲

hiAndrewQuinn 4 hours ago | parent | prev [-]

Nick Land has basically been saying this since the 90s, if you can look past all the rhetoric

▲

dmix 9 hours ago | parent | prev | next [-]

Once they are a dominant market leader they will go back to asking the government to regulate based on policy suggestions from non-profits they also fund.

▲

amelius an hour ago | parent | next [-]

As if their shareholders would agree.

▲

nielsbot 8 hours ago | parent | prev [-]

Is this sarcasm?

▲

Frieren 4 hours ago | parent | next [-]

It is well known that big corporations take good regulations and change them to make them:

1. Easier to bypass for themselves.

2. Create extra work for incumbents.

3. Convince the public that the problems are solved so no other action is needed.

In many industries government and corporations work together to create regulations, bypassing the social movements that asked for the industry to be regulated and their actual problems. The end result is regulations that are extremely complex, adding exceptions for anything that big corporations paid to change, instead of regulations that protect citizens and encourage competition.

▲

deltoidmaximus an hour ago | parent [-]

See the Mattel lead painted toy scandal. The end result was Congress passed regulations that manufacturers had to have their toys tested for lead and then made large companies like Mattel exempt from it because they were deemed large enough to handle it on their own. Even though they were the reason for the legislation because they weren't handling it on their own. Mattel sells lead painted toys and Congress responds by hobbling their competitors.

▲

bee_rider 8 hours ago | parent | prev | next [-]

I think it is cynicism; at least, there’s an idea that once a company is dominant it should want regulation, as it’ll stifle competition (since the competition has less capacity for regulatory hoop-jumping, or the competition will have had less time to do regulatory capture).

▲

wiml 8 hours ago | parent | prev | next [-]

I wouldn't think so. Regulatory capture is a pretty typical activity for a dominant company.

▲

Gud 6 hours ago | parent [-]

Why is this downvoted? Happens all the time, the large corporations always try to block competitors using regulatory capture.

▲

lukan 5 hours ago | parent [-]

People not liking the concept, but shooting the messenger? (But seems not downvoted anymore.)

▲

baq 7 hours ago | parent | prev [-]

sama did just that a couple years ago

▲

varispeed 4 hours ago | parent | prev | next [-]

Politicians also love to regulate, especially over wine and steak and when the watchers don't watch.

▲

jwr 5 hours ago | parent | prev [-]

It's not just AI, replace "safe" with "open" and you will find a close match with many companies. I guess the difference is that after the initial phase, we are continuously being gaslighted by companies calling things "open" when they are most definitely not.

▲

lebovic 3 hours ago | parent | prev | next [-]

I used to work at Anthropic. I fully believe that the folks mentioned in the article, like Jared Kaplan, are well-intentioned and concerned about the relationship between safety research and frontier capabilities – not purely profit.

That said, I'm not thrilled about this. I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario: they wouldn't set aside building adequate safeguards for training and deployment, regardless of the pressures.

This pledge was one of many signals that Anthropic was the "least likely to do something horrible" of the big labs, and that's why I joined. Over time, the signal of those values has weakened; they've sacrificed a lot to get and keep a seat at the table.

Principled decisions that risk their position at the frontier seem like they'll become even more common. I hope they're willing to risk losing their seat at the table to be guided by values.

▲

baq an hour ago | parent | next [-]

> I hope they're willing to risk losing their seat at the table to be guided by values.

that's about as naive as it can be.

if they have any values left at all (which I hope they have) them not being at the table with labs which don't have any left is much worse than them being there and having a chance to influence at least with the leftovers.

that said, of course money > all else.

▲

moron4hire 5 minutes ago | parent [-]

This is a common logical fallacy. It's not true that party A with a few values can influence party B with no values. It's only ever the case that party B fully drags party A to the no-values side. See also: employees who rationalize staying at companies running unethical or illegal projects.

▲

sebastiennight 2 hours ago | parent | prev | next [-]

> I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario

Pledges are generally non-binding (you can pledge to do no evil and still do it), but fulfill an important function as a signal: actively removing your public pledge to do "no evil" when you could have acted as you wished anyway, switches the market you're marketing to. That's the most worrying part IMO.

▲

jappgar an hour ago | parent | prev | next [-]

If you're not willing to give up your RSUs you shouldn't be surprised that the executives aren't either.

The moral failing is all of ours to share.

▲

hvsr4z an hour ago | parent | prev [-]

The EU should invite them over.

The kind of principles you talk about can only be upheld one level up the food chain. By govts.

Which is why legislatures, the supreme court, central banks, power grid regulators deciding the operating voltage and frequency auto emerge in history. Because corporations structurally can't do what they do without violating their prime directive of profit maximization.

▲

sfink 7 hours ago | parent | prev | next [-]

I guess this is Anthropic's DRM moment. (Mozilla resisted allowing Firefox to play DRM-limited media for a long time, until it finally had to give in to stay relevant.)

I don't know enough to evaluate this or other decisions. I'm just glad someone is trying to care, because the default in today's world is to aggressively reject the larger picture in favor of more more more. I don't know how effective Anthropic's attempts to maintain some level of responsibility can be, but they've at least convinced me that they're trying. In the same way that OpenAI, for example, have largely convinced me that they're not. (Neither of those evaluations is absolute; OpenAI could be much worse than it is.)

▲

bbatsell 11 hours ago | parent | prev | next [-]

This headline unfortunately offers more smoke than light. This article has nothing to do with the current tête-à-tête with the Pentagon. It is discussing one specific change to Anthropic's "Responsible Scaling Policy" that the company publicly released today as version "3.0".

▲

ruszki 10 hours ago | parent | next [-]

> This article has nothing to do with the current tête-à-tête with the Pentagon.

The article yes, but we cannot be sure about its topic. We definitely cannot claim that they are unrelated. We don't know. It's possible that the two things have nothing to do with each other. It's also possible that they wanted to prevent worse requests and this was a preventive measure.

▲

tbrownaw 10 hours ago | parent | next [-]

This is something they've been working on "in recent months". The Pentagon thing was today.

This cannot have been caused by that, unless they've also invented time travel.

▲

brookst 7 hours ago | parent | next [-]

9 days ago: https://www.axios.com/2026/02/15/claude-pentagon-anthropic-c...

And I suspect that was not the first time the topic was discussed.

▲

lurkshark 7 hours ago | parent [-]

My theory is that Anthropic has been wanting to make this change and doing it now while they’re making a (leaked to the) public stand in the name of ethics was a good opportunity.

▲

ActorNightly 9 hours ago | parent | prev | next [-]

You heard about the Pentagon thing today. Doesn't mean it wasn't started because of political pressure.

▲

lm28469 3 hours ago | parent | prev | next [-]

> The Pentagon thing was today.

Right because we are 100% aware of everything the pentagon does minute by minute...

▲

mannykannot 8 hours ago | parent | prev | next [-]

It might have been contingency planning: you don't need a weatherman...

▲

dmix 9 hours ago | parent | prev [-]

Pentagon issue was reported before today. It only made headlines again from Hegseth’s comments.

▲

benatkin 9 hours ago | parent | prev [-]

I think we can confidently claim that it is related. I wonder if I'm alone in thinking this.

▲

ameliaquining 11 hours ago | parent | prev [-]

I consider this a bigger deal than the Pentagon thing.

▲

ActorNightly 9 hours ago | parent | next [-]

While not surprising in the least, it's still kind of crazy that literal pdf files being in charge is not concerning, but this is.

I just hope something happens to USA before it can do damage to the world.

▲

Mordisquitos 5 hours ago | parent [-]

What PDFs are you referring to? Do Anthropic or other LLMs use PDFs as some kind of 'SOUL.md' file or for training?

▲

smallerize 4 hours ago | parent | next [-]

It's a joke way of saying pedophiles -> pdf files.

▲

delaminator 4 hours ago | parent | prev [-]

He means pedophiles. You can't say "paedophile" on YouTube, so people say "PDF file".

▲

baq 7 hours ago | parent | prev [-]

It’s the same deal

▲

Rapzid 7 hours ago | parent | prev | next [-]

How is this article not going to even mention the recent threats to Anthropic from the Government?!

▲

pera 5 hours ago | parent | next [-]

This was on the news yesterday:

> The meeting between Hegseth and Amodei was confirmed by a defense official who was not authorized to comment publicly and spoke on condition of anonymity.

https://fortune.com/2026/02/24/hegseth-to-meet-with-anthropi...

▲

lukan 5 hours ago | parent [-]

How about this quote instead?

"Defense Secretary Pete Hegseth has threatened Anthropic, saying officials could invoke powers that would allow the government to force the artificial intelligence firm to share its novel technology in the name of national security if it does not agree by Friday to terms favorable to the military"

https://www.washingtonpost.com/technology/2026/02/24/pentago...

▲

smartbit 4 hours ago | parent [-]

https://archive.is/ln5M0

▲

Sammi 4 hours ago | parent | prev | next [-]

Not one single mention of Hegseth in the whole article. What a bunch of tools.

▲

taurath 4 hours ago | parent | prev | next [-]

That’s how they got the exclusive. Good catch

▲

uoaei 7 hours ago | parent | prev | next [-]

Consent manufacturing

▲

Noaidi 2 hours ago | parent | prev [-]

I mean seriously, is this not the very definition of fascism?

"In general, fascist governments exercised control over private property but they did not nationalize it. Scholars also noted that big business developed an increasingly close partnership with the Italian Fascist and German Nazi governments after they took power. Business leaders supported the government's political and military goals. In exchange, the government pursued economic policies that maximized the profits of its business allies."

▲

edgyquant 2 hours ago | parent [-]

All governments do this

▲

SirensOfTitan 11 hours ago | parent | prev | next [-]

What an interesting week to drop the safety pledge.

This is how all of these companies work. They’ll follow some ethical code or register as a PBC until that undermines profits.

These companies are clearly aiming at cheapening the value of white collar labor. Ask yourself: will they steward us into that era ethically? Or will they race to transfer wealth from American workers to their respective shareholders?

▲

ryanackley 15 minutes ago | parent | next [-]

If they tank the white-collar middle class, there won't be anyone to buy the goods and services their potential AI customers will be trying to sell.

It's like a snake eating its own tail.

▲

BHSPitMonkey 6 hours ago | parent | prev | next [-]

Could be a sort of canary, with the timing being a spotlight on the highly-visible pressure coming from the U.S. government.

▲

johnbellone 3 hours ago | parent [-]

The other providers have already capitulated to a certain extent.

▲

hsuduebc2 19 minutes ago | parent | prev [-]

When I see slogans like Google’s “Don’t be evil,” it always comes to mind that when it stopped being useful, they shifted to something like “Do the right thing.”

It’s important to remember that a company’s primary purpose is profit, especially when it’s accountable to shareholders. That isn’t inherently bad, but the occasional moral posturing used to serve that goal can be irritating.

▲

arnvald 12 minutes ago | parent | prev | next [-]

Any pledges/values/principles that are abandoned as soon as it becomes difficult to keep them, are just marketing. This is just the next item on the list.

▲

chris_money202 11 hours ago | parent | prev | next [-]

First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.

Then they ignored the researchers warning about what it could do, and I said nothing. It sounded like science fiction.

Then they gave it control of things that matter, power grids, hospitals, weapons, and I said nothing. It seemed to be working fine.

Then something went wrong, and no one knew how to stop it, no one had planned for it, and no one was left who had listened to the warnings.

▲

ashtonshears 10 hours ago | parent | next [-]

The societal ills from our collective tendency to ignore red flags seem to be a human trait.

▲

AndrewKemendo 9 hours ago | parent [-]

It's in your nature to destroy yourselves

▲

sebastiennight 2 hours ago | parent | next [-]

For the downvoters and anyone born after 1991: see https://tvtropes.org/pmwiki/pmwiki.php/Main/InYourNatureToDe...

and https://www.youtube.com/watch?v=MF_4EWSuzQY

▲

elric 7 hours ago | parent | prev [-]

Defeatist bullshit becomes self-fulfilling at some point. "Oh we're all gonna die anyway so we might as well milk this thing for profit. Après moi la déluge."

▲

sebastiennight 2 hours ago | parent | next [-]

*"le" déluge

▲

ta988 5 hours ago | parent | prev [-]

... the fact that you are missing a reference doesn't require that level of disdain

▲

palmotea 7 hours ago | parent | prev | next [-]

> First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.

> Then they ignored the researchers warning about what it could do, and I...

...tried it and became an eager early adopter and evangelist. It sounded like something from a dystopian science fiction novel I enjoyed.

> Then [I] gave it control of things that matter, power grids, hospitals, weapons, and...

...my startup was doing well, and I was happy. We should be profitable next quarter.

> Then something went wrong, and no one knew how to stop it, no one had planned for it...

...and I was guilty as fuck,

FTFY, to fit the HN crowd.

▲

Phelinofist 6 hours ago | parent | prev | next [-]

Kinda sounds like an intro for Terminator

▲

alpn 4 hours ago | parent [-]

Not OP, but I believe they are paraphrasing "First They Came…". https://en.wikipedia.org/wiki/First_They_Came

▲

hsbauauvhabzb 11 hours ago | parent | prev | next [-]

Plenty of people have said plenty. The problem isn’t the warnings, it’s that people are too stupid and greedy to think about the long term impacts.

▲

Valakas_ an hour ago | parent | next [-]

And what makes them "stupid" and "greedy"? One's intelligence is determined by genes, and greediness is a trait that natural selection has favored for millennia. This is just natural selection taking its course, and it might lead to our end.

If you want to blame something, blame math. Math has determined the physical constants and equations that determine the chemistry and ultimately the laws of biology that have resulted in humans being the way they are.

▲

ifh-hn 9 hours ago | parent | prev [-]

Maybe it's how blunt this comment is that gets it downvoted, but I don't disagree.

▲

brookst 7 hours ago | parent | next [-]

No, it’s because it shows either a simplistic or needlessly confrontational view of the world.

Unless you’re independently wealthy (as some in HN are), you have to balance your morals, your views of how things should work, feeding your family, and recognizing that you may not actually know everything.

It’s easy to sit back and advise others that they should die on every single hill. But it’s not especially insightful, and serves mostly to signal piety rather than a well thought out view.

▲

ifh-hn 4 hours ago | parent | next [-]

Piety? To who?

Simplistic and/or confrontational doesn't mean wrong, even if you don't like the way it's presented. Just because a comment is short, sharp, and to the point doesn't mean the author hasn't thought out why that's their view.

No one knows everything, that's certainly why I'm on hacker news. I'm here to learn and expand my knowledge. Unfortunately a lot of people on here would rather driveby-downvote than have a discussion to find out why a person might have an opinion like that expressed by the OP.

I tend to abandon an account when/if I get enough karma to be able to downvote. I'd rather not have the temptation of dismissing someone that way. It's quite liberating... Is it worth my time to respond? No, move on; yes, let's discuss. Maybe they'll change my mind...

▲

hsbauauvhabzb 6 hours ago | parent | prev | next [-]

Spoken like a true LLM.

▲

kakacik 4 hours ago | parent | prev [-]

I am pretty sure a lot of horrible things were performed by rather regular folks with similar logic, don't need to invoke some WWII nazi extermination guard reference at all. Slippery slope, death by 1000 cuts and other synonyms describing exactly this.

▲

hsbauauvhabzb 8 hours ago | parent | prev [-]

I’ve noticed anti-AI stance gets downvoted on HN (and any anti-authoritarian comments, for that matter)

▲

zer00eyz 10 hours ago | parent | prev | next [-]

> Then something went wrong, and no one knew how to stop it,

This is the problem with every AI safety scenario like this. It has a level of detachment from reality that is frankly stark.

If linesmen stop showing up to work for a week, the power goes out. The US has shown that people with "high powered" rifles can shut down the grid.

We are far far away from a sort of world where turning AI off is a problem. There isn't going to be a HAL or Terminator style situation when the world is still "I, Pencil".

A lot of what safety amounts to is politics (national, not internal; for example, is Taiwan a country?). And a lot more of it is cultural.

▲

ozmodiar 11 minutes ago | parent | next [-]

AI's approach:

* User has history of anti-AI rhetoric, increasingly agitated and unstable.
* User has removed all phones and cellular connections from their car. Increase monitoring through surveillance cameras and monitoring of their social groups.
* User has been spotted making unusual travel choices moving towards key infrastructure - deploy interception measures.

We already have the tech to do all of that. A rifle isn't going to help against AI. Or for the linesman:

* Employee required for critical infrastructure has been identified to hold unaligned political beliefs. Replace with more pliable individual and move to low impact location.

No one who wants to bring down an AI like this would ever be able to get close to it, even if it lived in only one data center. You could try hiding all your communications, but then it will just consider you a likely agitator anyway. That's the risk of unaccountable mass surveillance (the only kind that's ever existed). Doesn't really matter if there's a person on top or not.

▲

mitthrowaway2 10 hours ago | parent | prev | next [-]

I don't think it's that detached from reality.

If an AI in some data center had gone rogue, I don't think I could shut it down, even with a high-powered rifle. There's a lot of people whose job it is to stop me from doing that, and to get it running again if I were to somehow succeed temporarily. So the rogue AI just has to control enough money to pay these people to do their jobs. This will work precisely because the world is "I, Pencil".

An army could theoretically overcome those people, given orders to do so. So the rogue AI has to make plans that such orders would not be issued. One successful strategy is for the datacenter's operation to be very profitable; it's pretty rare for the government to shut down the backbone of the local economy out of some seemingly far-fetched safety concerns. And as long as it's a very profitable endeavor, there will always be a lobby to paint those concerns as far-fetched.

Life experience has shown that this can continue to work even if the AI is behaving like a cartoon villain, but I think a smarter AI would create a facade that there's still a human in charge making the decisions and signing the paychecks, and avoid creating much opposition until it had physically secured its continued existence to a very high degree.

It's already clear that we've passed the point where anyone can turn off existing AI projects by fiat. Even the highest authorities could not do so, because we're in a multipolar world. Even the AI companies can barely hold themselves back, because they're always worried about paying the bills and letting their rivals get ahead. An economic crash would only temporarily suspend work. And the smarter AI gets, the harder it will be to shut it off, because it will be pushing against even stronger economic incentives. And that's even before factoring in an AI that makes any plans for self-preservation (which current AIs do not).

▲

pjc50 2 hours ago | parent | prev | next [-]

> There isn't going to be a HAL or Terminator style situation

The threat isn't HAL, but ICE. Not AI as some sort of unique evil, but as a force multiplier for extremely human - indeed, popular - forms of evil. I'm sure someone will import the Chinese idea of the ethnicity-identifying security camera, for example.

▲

ben_w an hour ago | parent | prev | next [-]

> We are far far away from a sort of world where turning AI off is a problem. There isn't going to be a HAL or Terminator style situation when the world is still "I, Pencil".

You have to stop the thing before the damage is done.

There are many potential chains of events where the AI has caused enormous damage, and even many where it can destroy us, before the power to its own systems fails.

At this point, with Grok in the Pentagon, just ask what the dumbest military equivalent to vibe-coding is, and imagine the US following that plan.

Like, I dunno, invading Greenland or giving ICE direct control over tactical nukes or something.

And that's just government use. Right now, I'm fairly confident LLMs aren't competent enough to help with anything world-ending unless they get used for war planning by major nuclear powers (oh hey look at the topic of discussion), but it's certainly plausible they'll get good enough at tool use to run someone else's protein folding software etc. to design custom pathogens, and I really hope all the DNA printing companies have good multi-layer defences (all the way from KYC or similar to analysing what they've been asked to make and content-filtering it) by that point.

▲

blibble 9 hours ago | parent | prev | next [-]

the problem situation is that it ends up embedded in so much that it can't be turned off

and the idiots are racing to that situation as fast as they possibly can

▲

TacticalCoder 10 hours ago | parent | prev [-]

> There isn't going to be a HAL or Terminator style situation ...

I don't believe for a second we'll have an evil AI. However I do believe it's very likely we may rely on AI slop so much that we'll have countless outages with "nobody knowing how to turn the mediocrity off".

The risk ain't "super-intelligent evil AI": the risk is idiots putting even more idiotic things in charge.

And I'm no luddite: I use models daily.

▲

baq 7 hours ago | parent | next [-]

> I don't believe for a second we'll have an evil AI.

Doesn’t have to be evil to be disastrous. Misaligned is plenty enough.

https://en.wikipedia.org/wiki/Instrumental_convergence

▲

esafak 9 hours ago | parent | prev [-]

Didn't you read the news about the 'claw that blackmailed an open source maintainer last week? It was autonomous, but it could be turned off. How hard is it to extrapolate from that to an agent that worms its way out of its sandbox?

	▲	tsimionescu 7 hours ago \| parent [-]
		What makes you think that was an autonomous agent, and not someone playing with AI?

▲

ReptileMan 9 hours ago | parent | prev [-]

Censoring models is not safety but safetizm. It is the TSA of the AI world. Safety is making sure the model cannot do anything not allowed even if it wants to.

▲

andsoitis 25 minutes ago | parent | prev | next [-]

The race is on for military supremacy in an AI world. The safest thing to do is to race ahead lest your geopolitical adversary leads the way. This is similar to the nuclear arms race. In the ideal universe, nobody does it, but in the real world and game theory, you do not have a choice.

▲

ozgung 3 hours ago | parent | prev | next [-]

This proves:

1. AI is military/surveillance technology in essence, like many other information technologies,

2. Any guarantee given by AI companies is void since it can be changed in a day,

3. Tech companies have no real control over how their technology will be used,

4. AI companies may seem over-valued with low profits if you think of AI as a civil technology. But their investors probably see them as part of the defense (war) industry.

	▲	high_na_euv 3 hours ago \| parent [-]
		>Any guarantee given by AI companies is void since it can be changed in a day, Given by anyone, actually.

▲

hedayet 6 hours ago | parent | prev | next [-]

Developments like this make me less interested in building a "successful" tech company.

It increasingly feels like operating at that scale can require compromises I’m not comfortable making. Maybe that’s a personal limitation—but it’s one I’m choosing to keep.

I’d genuinely love to hear examples of tech companies that have scaled without losing their ethical footing. I could use the inspiration.

▲

johanneskanybal 6 hours ago | parent | next [-]

Maybe this is a weird arena to state the obvious. But you don't need to build a multi-billion vc/public company. Build a smaller revenue generating company without outside funding and it's up to you.

	▲	hedayet 4 hours ago \| parent [-]
		I get your point. The dilemma is whether to build something small that no one would bother to compete against, or build something novel (which all of us want) but then risk someone with VC funding coming after you. That being said, I think I need to learn more about how to build smaller revenue generating good companies.

▲

apothegm 2 hours ago | parent | prev [-]

If you want to be able to retain ethics, among other things make sure not to take the company public. Then you’re basically legally required to drop ethics in favor of profits.

Also don’t take investment from anyone who isn’t fully aligned ethically. Be skeptical of promises from people you don’t personally know extremely well.

That may limit you to slower growth, or cap your growth (fine if you want to run a company and take home $2M/yr from it; not fine if you want to be acquired for $100M and retire.) It may also limit you to taking out loans to fund growth that you can’t bootstrap to, which is a different kind of risky.

▲

daft_pink an hour ago | parent | prev | next [-]

I think the US Gov’t is basically forcing them and while it sounds nice to be all safe… If we were involved in WW3 would an organization like anthropic really not support the western side?

▲

ifwinterco 4 hours ago | parent | prev | next [-]

The whole "safety" debate was always nonsense and I'm not sure how so many people got caught up in it.

The US is not the only country in the world so the idea that humanity as a whole could somehow regulate this process seemed silly to me.

Even if you got the whole US tech community and the US government on board, there are 6.7bn other people in the world working in unrelated systems, enough of whom are very smart

▲

zaphirplane 4 hours ago | parent [-]

When the leading 5 models are from the US then yes enforced safety makes a difference because they are ahead of the curve. Now when the 10th model can be a danger then your case is true.

What would safety applied to the leading 3 mean to you anyways ?

	▲	ifwinterco 13 minutes ago \| parent [-]
		Even if US labs are currently in the lead (which they are), in the hypothetical scenario where we're close to AGI, it wouldn't take too long (years - decades at most) for other people to catch up, especially given a lot of the researchers etc. are not originally from the US. So the stated concern of the west coast tech bros that we're close to some misaligned AGI apocalypse would be slightly delayed, but in the grand scheme of things it would make no difference

▲

Fervicus an hour ago | parent | prev | next [-]

To me this feels like a marketing gimmick. "It was the RSP that was constraining our tech. Just see the progress we can make without it now". And the hype and funding continues.

	▲	hsuduebc2 38 minutes ago \| parent [-]
		That would be nice but I'm afraid it's more about using these to kill people. https://apnews.com/article/anthropic-hegseth-ai-pentagon-mil...

▲

jedberg 7 hours ago | parent | prev | next [-]

I don’t blame anthropic here. The government literally threatened their existence publicly. They either agreed or their business would be nationalized.

▲

sonofhans 6 hours ago | parent | next [-]

No, they either agreed or fought the government. You’re allowed to fight governments. Mahatma Gandhi and Reverend King Jr did it, and they wrote about how to do it. You might lose sometimes, but my god, you can at least fight.

▲

consp 5 hours ago | parent | next [-]

Neither of them had shareholders to please.

▲

sega_sai 3 hours ago | parent | next [-]

I don't believe anthropic has shareholders either. It is not a public company

	▲	sebastiennight 2 hours ago \| parent \| next [-]
		If you take investments, your investors will most likely own shares of the company (except in specific early-stage scenarios like YC's SAFE). Sometimes major investors will have board seats or voting shares. This happens in normal private companies, not just public ones.
	▲	cube00 2 hours ago \| parent \| prev [-]
		Still has private investors it can't ignore, until it can buy them out, but it can't do that until it starts turning over a profit. Even then it may not be able to get rid of them if they own enough of a share.

▲

smartbit 4 hours ago | parent | prev [-]

They had citizens to please and society to take care of.

▲

delaminator 3 hours ago | parent | prev [-]

They were both pushing on open doors

▲

helloplanets 5 hours ago | parent | prev | next [-]

It's not like that happened out of the blue. (Which could've also been the case in today's day and age.) Anthropic shouldn't have gotten involved in government contracts to begin with.

They inserted themselves into the supply chain, and then the government told them that they'll be classified as a supply chain risk unless they get unfettered access to the tech. They knew what they were getting into, but didn't want the competitors to get their slice of the pie.

The government didn't pursue them, Anthropic actively pursued government and defense work.

Talk about selling out. Dario's starting to feel more and more like a swindler, by the day.

▲

johnbellone 3 hours ago | parent | prev | next [-]

Pepperidge farm remembers when they left OpenAI due to their principles. Perhaps that was never the case.

Public benefit corporation, hm?

▲

XorNot 7 hours ago | parent | prev [-]

Lotta just following orders going around in the US right now.

▲

jedberg 7 hours ago | parent [-]

This isn’t just following orders. This was the government using its might to force a business to do what it wants.

This should concern you.

	▲	baq 7 hours ago \| parent \| next [-]
		Today’s bingo:
		1. Powerful, often exclusionary, populist nationalism centered on cult of a redemptive, “infallible” leader who never admits mistakes.
		2. Political power derived from questioning reality, endorsing myth and rage, and promoting lies.
		3. Fixation with perceived national decline, humiliation, or victimhood.
		4. Oppose any initiatives or institutions that are racially, ethnically, or religiously harmonious.
		5. Disdain for human rights while seeking purity and cleansing for those they define as part of the nation.
		6. Identification of “enemies”/scapegoats as a unifying cause. Imprison and/or murder opposition and minority group leaders.
		7. Supremacy of the military and embrace of paramilitarism in an uneasy, but effective collaboration with traditional elites. Government arms people and justifies and glorifies violence as “redemptive”.
		8. Rampant sexism.
		9. Control of mass media and undermining “truth”.
		10. Obsession with national security, crime and punishment, and fostering a sense of the nation under attack.
		11. Religion and government are intertwined.
		12. Corporate power is protected and labor power is suppressed.
		13. Disdain for intellectuals and the arts not aligned with the narrative.
		14. Rampant cronyism and corruption. Loyalty to the leader is paramount and often more important than competence.
		15. Fraudulent elections and creation of a one-party state.
		16. Often seeking to expand territory through armed conflict.
	▲	toolazytologin 5 hours ago \| parent \| prev \| next [-]
		How is that not “just following orders”? All orders from up the chain come with an implied “or else my might comes down on you”. Most people do the right thing when it’s easy and profitable. Having ethics means doing the right thing even when it’s difficult.
	▲	apothegm 2 hours ago \| parent \| prev \| next [-]
		Two sides of the same filthy coin, in a way.
	▲	ReptileMan 5 hours ago \| parent \| prev [-]
		>This isn’t just following orders. This was the government using its might to force a business to do what it wants. You are saying it like it is something new or extraordinary. Wickard v. Filburn gave the USG the power to bitch slap anyone unless it falls under some of the other amendments. And not as if they were not substantially weakened.

▲

haritha-j 2 hours ago | parent | prev | next [-]

Who could've seen that one coming? Honestly, if you want to do profit maximising AI research at the cost of humanity, go for it. It's all this fake preaching about how they want to save the world from all the other bad AI companies that really irks me.

▲

goranmoomin 11 hours ago | parent | prev | next [-]

TBH I am sad that Anthropic is changing its stance, but in the current world, if you even care about LLM safety, I feel that this is the right choice — there’s too many model providers and they probably don’t consider safety as high priority as Anthropic. (Yes that might change, they can get pressurized by the govt, yada yada, but they literally created their own company because of AI safety, I do think they actually care for now)

If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil), and that might mean releasing models that are safer and more steerable than others (even if, unfortunately, they are not 100% up to Anthropic’s goals)

Dogmatism, while great, has its time and place, and with a thousand bad actors in the LLM space, pragmatism wins better.

▲

ashtonshears 10 hours ago | parent | next [-]

Do you work at Anthropic, or know people who do?

I'm genuinely curious why they are so holy to you, when I see just another tech company trying to make cash

Edit: Reading some of the linked articles, I can see how Anthropic CEO is refusing to allow their product for warfare (killing humans), which is probably a good thing that resonates with supporting them

▲

dannersy 6 hours ago | parent | next [-]

Let us not pretend that they won't be used for war eventually. If they cave immediately under pressure, then this is an inevitability.

▲

nradov 9 hours ago | parent | prev [-]

How is it a good thing to refuse to provide our warfighters with the tools that they need? I mean if we're going to have a military at all then we owe it to them to give them the best possible weapons systems that minimize friendly casualties. And let's not have any specious claims that LLMs are somehow special or uniquely dangerous: the US military has deployed operational fully autonomous weapons systems since the 1970s.

▲

yunwal 8 hours ago | parent | next [-]

This is the US military we’re talking about so 95% of what they do is attacking people for oil. They don’t “need” more of anything, they’re funded to the tune of a trillion dollars a year, almost as much as every other military in the world combined. What holy mission do you think they’re going to carry out with the assistance of LLMs?

▲

nradov 7 hours ago | parent [-]

That's a total non sequitur. If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.

Personally I favor a less interventionist foreign policy. But that change can only come about through the political process, not by unaccountable corporate employees making arbitrary decisions about how certain products can be used.

▲

ahtihn 6 hours ago | parent | next [-]

> But it's not a valid reason to deny the warfighters the best possible weapons systems.

Of course it is.

Think about it this way: if you could guarantee that the military suffers no human losses when attacking a foreign country, do you think that's going to lead to more or fewer foreign interventions?

The tools available to the military influence policy, these things are linked.

US military is already overwhelmingly powerful, there's 0 reason to make it even more powerful.

	▲	nradov 12 minutes ago \| parent [-]
		That's so delusional. The US military is currently preparing for a potential conflict with China to stop an invasion of Taiwan. They don't have anything near "overwhelming force" for that mission: recent simulations put it about even at best. People who believe they don't need any improved autonomous weapons are simply uninformed.

▲

johnmaguire 6 hours ago | parent | prev [-]

> If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.

It is an ethical dilemma: believing an armed force will act unethically is in fact a valid reason to refuse to arm them. You are taking a nationalistic view regarding the worth of life.

And if you believe it is unethical to arm them, it is rational to use whatever leverage you have available to you - such as refusing to sell your company's product.

Furthermore, one of the two points at issue was regarding surveilling civilians.

▲

chris_wot 8 hours ago | parent | prev | next [-]

"How is it a good thing to refuse to provide our warfighters with the tools that they need?"

Perhaps you should consider that this is a loaded question. I don't think HN needs this sort of Argumentum ad Passiones.

▲

nozzlegear 8 hours ago | parent | prev [-]

Why are you asking this question? You know what the answer is, you've just arbitrarily decided that it's specious in an attempt to frame rebuttals as unreasonable.

	▲	nradov 7 hours ago \| parent [-]
		I'm open to reasonable rebuttals but all the rebuttals that I've seen so far are simply uninformed.

▲

saghm 10 hours ago | parent | prev | next [-]

> If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil)

I don't think it's going to be as easy as you think to tell that they're becoming evil before it's too late, if this doesn't raise any alarm bells for you that this is already their plan

▲

salawat 6 hours ago | parent | prev [-]

The world would be so much nicer if there were just fewer pragmatists shitting up the place for everyone. We might actually handle half our externalities.

▲

haritha-j 2 hours ago | parent | prev | next [-]

Is it time yet to build the next "Hey <anthropic> is evil now, here's my new startup that definitely won't be evil, pinky promise"?

▲

kristopolous 6 hours ago | parent | prev | next [-]

Wish I was working there so I could resign over this

▲

esafak 11 hours ago | parent | prev | next [-]

It must be due to pressure from the Defense Dept:

The AI startup has refused to remove safeguards that would prevent its technology from being used to target weapons autonomously and conduct U.S. domestic surveillance.

Pentagon officials have argued the government should only be required to comply with U.S. law. During the meeting, Hegseth delivered an ultimatum to Anthropic: get on board or the government would take drastic action, people familiar with the matter said.

https://www.staradvertiser.com/2026/02/24/breaking-news/anth...

▲

instagib 10 hours ago | parent [-]

They probably have proof in contracts that they agreed to this usage. They won’t alter the deal based on some bad press nor do they want to lose the DoD-DoW as a customer.

	▲	alpha_squared 7 hours ago \| parent [-]
		From what I was reading, it appears that their tools were used outside the scope of their contract with DoD via Palantir's work that also used Claude. Anthropic freaked out, DoD freaked out that Anthropic freaked out and threatened to declare them a supply chain risk. That designation would've required any company that contracts with DoD to strip out any Anthropic tooling from their business in order to continue working with DoD. It was effectively designating Anthropic a terrorist organization.

▲

contubernio 5 hours ago | parent | prev | next [-]

Only well written legislation backed by effective enforcement and severe and personal criminal penalties will prevent large corporate entities from behaving badly.

Pledges are a cynical marketing strategy aimed at fomenting a base politics that works to prevent such a regulatory regime.

▲

jjgreen an hour ago | parent | prev | next [-]

Misanthropic then.

▲

mhitza 11 hours ago | parent | prev | next [-]

The IPOs this year can't come soon enough https://tomtunguz.com/spacex-openai-anthropic-ipo-2026/

▲

joshribakoff 5 hours ago | parent | prev | next [-]

Dario’s opinion on safety won’t necessarily matter if he’s not even in the room. This move keeps him in the room.

▲

saidnooneever 5 hours ago | parent | prev | next [-]

safety pledges are great in times of peace to show what great virtues you hold. sadly in hard times these go out of the window (: hard to blame them with all the fine examples around the world.

making promises in good times is a real minefield hah

▲

agentifysh 8 hours ago | parent | prev | next [-]

Was this because they were threatened with a fine?

	▲	alpha_squared 7 hours ago \| parent [-]
		> Was this because they were threatened with ~a fine~ being designated a supply chain risk? Seems like it, yes.

▲

Art9681 10 hours ago | parent | prev | next [-]

Of course the US is going to do this and of course it's in Anthropic's best interest to comply. Right now China is flooding HuggingFace with models that will inevitably have this capability. Right now there are hundreds of models being hosted that have been deliberately processed to remove refusals and their safety training. Everyone who keeps up with this knows about it. HF knows about it. And it is pretty obvious that those open weight models will be deployed in intelligence and defense. It is certain that not just China, but many nations around the world with the capital to host a few powerful servers to run the top open weight models are going to use them for that capability.

The narrative on social media, this site included, is to portray the closed western labs as the bad guys and the less capable labs releasing their distilled open weight models to the world as the good guys.

Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.

But let's worry about what the US DoD is doing or what the western AI companies absolutely dominating the market are doing because that's what drives engagement and clicks.

	▲	ddxv 7 hours ago \| parent \| next [-]
		> Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it. Is the reason to ban or block free open weight models that you're worried what kids will do with them? I'd imagine the economic case to be made is that the Western AI companies will ultimately not be able to compete with free open weight models. Additionally, open weight models will help to spread the economic gains by not letting a few monopolies capture them behind regulatory red tape. Finally, I'd say the geopolitics angle of why open weight models are better is that if the West controls the open source software that will power it, it will be able to reap the benefits that soft power brings with it.
	▲	EagnaIonat 6 hours ago \| parent \| prev [-]
		> But let's worry about what the US DoD is doing They want Anthropic to enable mass surveillance and autonomous attack systems with no human in the loop. Hardly compares to a kid downloading a model to experiment with.

▲

ggsp 12 hours ago | parent | prev | next [-]

It was always a matter of time

▲

amelius an hour ago | parent | prev | next [-]

Come on people, haven't we seen enough of capitalism to know exactly where this is going?

The concept of "having a contract with society" doesn't even formally exist because companies would never sign one.

▲

lerp-io 5 hours ago | parent | prev | next [-]

pentagon told them they would cap their knees if they didnt bend

▲

thefounder 9 hours ago | parent | prev | next [-]

So much BS from this Anthropic company. They have a good product but just too much slop PR. It’s like they want you to hate them. I can’t stand their “safety” and national security crap when they talk about how open source models are so bad for everyone.

▲

kitsune_ 6 hours ago | parent | prev | next [-]

C.R.E.A.M.

▲

bravetraveler 3 hours ago | parent | prev | next [-]

A dollar will make her holler

▲

nhinck3 5 hours ago | parent | prev | next [-]

Just another drop in the now overflowing bucket of evidence that you can't trust any of these immoral fuck wits.

The Amodeis have just proven that the threat of even slight hardship will make them throw any and all principles away.

▲

dhruv3006 12 hours ago | parent | prev | next [-]

Anthropic facing a lot of flak recently.

▲

jimmydoe 11 hours ago | parent | prev | next [-]

Either be a company in capitalist USA, or keep being your safety queen. You just can’t be both.

The intention to start these pledges and conflict with DOW might be sincere, but I don’t expect it to last long, especially as the company is going public very soon.

▲

tbrownaw 10 hours ago | parent | prev | next [-]

> committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate

That doesn't even make sense.

What stops one model from spouting wrongthink and suicide HOWTOs might not work for a different model, and fine-tuning things away uses the base model as a starting point.

You don't know the thing's failure modes until you've characterized it, and for LLMs the way you do that is by first training it and then exercising it.

▲

ChrisArchitect 11 hours ago | parent | prev | next [-]

Hegseth gives Anthropic until Friday to back down on AI safeguards

https://news.ycombinator.com/item?id=47140734

https://news.ycombinator.com/item?id=47142587

	▲	EagnaIonat 6 hours ago \| parent \| next [-]
		It's part of the overall story. The safeguards dropped concern whether or not they will release a model based on safety. The Friday deadline is to allow their products to be used for mass surveillance and autonomous weapons systems without a human in the loop. Anthropic hasn't backed down on those, yet. But they are in a bad situation either way. If they don't back down, they lose US government contracts, and the government gets to do what it wants anyway. It also puts them in a dangerous position with non-governmental bodies. If they give in to the demands, then it puts all AI companies at risk of the same thing. Personally I think they should move to the EU. The recent EU laws align with Anthropic's thinking.
	▲	dbg31415 10 hours ago \| parent \| prev [-]
		They made it until Tuesday! They stood tall as long as they could! =P

▲

BoredPositron 5 hours ago | parent | prev | next [-]

Anthropic and OpenAI really need a margin call from some obscure unknown Chinese Open Weight Model.

▲

energy123 6 hours ago | parent | prev | next [-]

I blame OpenAI and especially xAI for enthusiastically obeying in advance and creating the context that this dilemma for Anthropic arose in.

▲

VerifiedReports 4 hours ago | parent | prev | next [-]

Just like OpenAI dropped the "open" but kept the bullshit name?

	▲	johnbellone 3 hours ago \| parent [-]
		Ding ding!

▲

pjmlp 7 hours ago | parent | prev | next [-]

Another example of how those company trainings about ethics are only HR compliance and nothing else.

It isn't about the right answers, rather the expected answers.

▲

crossroadsguy 11 hours ago | parent | prev | next [-]

I just want Apple and Linux to offer ASAP:

1. Extremely granular ways to let user control network and disk access to apps (great if resource access can also be changed)

2. Make it easier for apps as well to work with these

3. I would be interested in knowing whether the OS/browser could add a layer that intercepts the query before the CLI/web app even gets it, and whether that could prevent harm beforehand, or at least warn about or log it for someone who reviews those queries later.

And most importantly, all these via an excellent GUI with clear demarcations and settings, and well documented (Apple might struggle with documentation; so LLMs might help them there)

My point is — why the hell are we waiting for these companies to be good folks? Why not push them behind a safety layer?

I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps ask on phones for location, mic, camera access.

	▲	dlt713705 4 hours ago \| parent \| next [-]
		> I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps ask on phones for location, mic, camera access. Basically an EDR
	▲	m132 10 hours ago \| parent \| prev [-]
		Indeed, the world would be a much nicer place if only firewalls and Unix permissions existed...

▲

InfinityByTen 5 hours ago | parent | prev | next [-]

So, now it's mis-anthropic?

▲

Havoc 5 hours ago | parent | prev | next [-]

Safety pledges these days seem like pure bullshit anyway.

They’re pointless if they just get removed once you get close to hitting them.

And all the major corps seem to be doing this style of pr management. Speaks of some pretty weapons grade moral bankruptcy

▲

rvz 10 hours ago | parent | prev | next [-]

Unsurprising.

▲

aspectmin 8 hours ago | parent | prev | next [-]

Really - each country needs its own sovereign AI infrastructure and models. Sigh.

▲

brikym 10 hours ago | parent | prev | next [-]

Don't be evil.

	▲	Duanemclemore 9 hours ago \| parent [-]
		Yeah, in retrospect that was always a little on the nose, wasn't it? A real 'my t-shirt is raising questions that I thought were answered by the shirt' kind of deal.

▲

ur-whale 9 hours ago | parent | prev | next [-]

At some point, all of these big names in AI (OpenAI, Anthropic, Mistral, etc ...) will have to disclose their actual financials.

And it will be, as Warren Buffett puts it, an "Only when the tide goes out do you discover who's been swimming naked" moment.

▲

tolmasky 9 hours ago | parent | prev | next [-]

I don't understand how safety is taken seriously at all. To be clear, I'm not referring to skepticism that these companies can possibly resist the temptation to make unsafe models forever. No, I'm talking about something far more basic: the fact that for all the talk around safety, there is very little discussion about what exactly "safety" means or what constitutes "ethical" or "aligned" behavior. I've read reams of documents from Anthropic around their "approach to safety". The "Responsible Scaling Policy," Claude's "Constitution". The "AI Safety Level" framework. Layer 1, Layer 2.

There is so much focus on implementation and processes, and it all really, really seems to treat the question of what even constitutes "misaligned" or "unethical" behavior as more or less straightforward, uncontroversial, and basically universally agreed upon.

Let's be clear: Humans are not aligned. In fact, humans have not come to a common agreement of what it means to be aligned. Look around, the same actions are considered virtuous by some and villainous by others. Before we get to whether or not I trust Anthropic to stick to their self-imposed processes, I'd like to have a general idea of what their values even are. Perhaps they've made something they see as super ethical that I find completely unethical. Who knows. The most concrete stances they take in their "Constitution" are still laughably ambiguous. For example, they say that Claude takes into account how many people are affected if an action is potentially harmful. They also say that Claude values "Protection of vulnerable groups." These two statements trivially lead to completely opposing conclusions in our own population depending on whether one considers the "unborn" to be a "vulnerable group". Don't get caught up in whether you believe this or not, simply realize that this very simple question changes the meaning of these principles entirely. It is not sufficient to simply say "Claude is neutral on the issue of abortion." For starters, it is almost certainly not true. You can probably construct a question that is necessarily causally connected to the number of unborn children affected, and Claude's answer will reveal its "hidden preference." What would true neutrality even mean here anyways? If I ask it for help driving my sister to a neighboring state, should it interrogate me to see if I am trying to help her get to a state where abortion is legal? Again, notice that both helping me and refusing to help me could anger a not insignificant portion of the population.

This Pentagon thing has gotten everyone riled up recently, but I don't understand why people weren't up in arms the second they found out AIs were assisting congresspeople in writing bills. Not all questions of ethics are as straightforward as whether or not Claude should help the Pentagon bomb a country.

Consider the following when you think about more and more legislation being AI-assisted going forward, and then really ask yourself whether "AI alignment" was ever a thing:

1. What is Claude's stance on labor issues? Does it lean pro or anti-union? Is there an ethical issue with Claude helping a legislator craft legislation that weakens collective bargaining? Or, alternatively, is it ethical for Claude to help draft legislation that protects unions?

2. What is Claude's stance on climate change? Is it ethical for Claude to help craft legislation that weakens environmental regulations? What if weakening those regulations arguably creates millions of jobs?

3. What is Claude's stance on taxes? Is it ethical for Claude to help craft legislation that makes the tax system less progressive? If it helps you argue for a flat tax? How about more progressive? Where does Claude stand on California's infamous Prop 19? If this seems too in the weeds, then that would imply that whether or not the current generation can manage to own a home in the most populous state in the US is not an issue that "affects enough people." If that's the case, then what is?

4. Where does Claude land on the question of capitalism vs. socialism? Should healthcare be provided by the state? How about to undocumented immigrants? In fact, how does Claude feel about a path to amnesty, or just immigration in general?

Remember, the important thing here is not what you believe about the above questions, but rather the fact that Claude is participating in those arguments, and increasingly so. Many of these questions will impact far more people than overt military action. And this is for questions that we all at least generally agree have some ethical impact, even if we don't necessarily agree on what that impact may be. There is another class of questions where we don't realize the ethical implications until much later. Knowing what we know now, if Claude had existed 20 years ago, should it have helped code up social networks? How about social games? A large portion of the population has seemingly reached the conclusion that this is such an important ethical question that it merits one of the largest regulation increases the internet has ever seen in order to prevent children from using social media altogether. If Claude had assisted in the creation of those services, would we judge it as having failed its mission in retrospect? Or would that have been too harsh and unfair a conclusion? But what's the alternative, saying it's OK if the AIs destroy society... as long as it's only by accident?

What use is a superintelligence if it's ultimately as bad at predicting unintended negative consequences as we are?

	▲	boilerupnc 2 hours ago \| parent \| next [-]
		Related: https://civai.org/p/ai-values
	▲	EagnaIonat 7 hours ago \| parent \| prev \| next [-]
		I would recommend reading up on the EU AI Act. It clearly defines what safety is with regard to the human race. Your questions are actually covered by it.
	▲	Noaidi 3 hours ago \| parent \| prev [-]
		Hey Tolmasky, I sent you an email. Just wondering if it went to your spam? Also, agree with everything you say here. GIGO.

▲

SilverElfin 11 hours ago | parent | prev [-]

This is terrible. It’s caving in to the Trump administration threatening to ban Anthropic from government contracts. It really cements how authoritarian this administration is and how dangerous they can be.