| ▲ | lebovic 3 hours ago |
| I used to work at Anthropic, and I wrote a comment on a thread earlier this week about the RSP update [1]. It's heartening to see that leaders at Anthropic are willing to risk losing their seat at the table to be guided by values. Something I don't think is well understood on HN is how driven by ideals many folks at Anthropic are, even if the company is pragmatic about achieving its goals. I have strong signal that Dario, Jared, and Sam would genuinely burn at the stake before acceding to something that's a) against their values, and b) something they think is a net negative in the long term. (Many others would, too; those three are just the best-known.) That doesn't mean that I always agree with their decisions, and it doesn't mean that Anthropic is a perfect company. Many groups driven by ideals have still committed horrible acts. But I do think that most people making the important decisions at Anthropic are well-intentioned, driven by values, and genuinely motivated by trying to make the transition to powerful AI go well. [1]: https://news.ycombinator.com/item?id=47145963#47149908 |
|
| ▲ | neom 2 hours ago | parent | next [-] |
| I've had so much abuse thrown at me on here for saying this very thing over the last few years. I used to be friends with Jack back in the day, before this AI stuff even kicked off. Once you know who people really are inside, it's easy to know how they will act when the going gets rough. I'm glad they are doing the right thing, but I'm not at all surprised, and neither should anyone else be. Personally, I believe they would go to jail/shut down/whatever before they did something objectively wrong. |
| |
| ▲ | skeptic_ai 17 minutes ago | parent | next [-] | | I doubt it, but let's see how it unfolds in two weeks. If they're free -> they agreed to give in. If they're in prison -> they stood by their values. | |
| ▲ | taurath 26 minutes ago | parent | prev | next [-] | | > it's easy to know how they will act when the going gets rough Even if you went to Burning Man and your souls bonded, you only know a person at a particular point in time - people's traits flanderize, they change, they emphasize different values, they develop different incentives or commitments. I've watched very morally certain people fall into mania or deep cynicism over the last 10 years as the pillars of society show their cracks. That said, it is heartening to know that some would predict anyone in Silicon Valley would still take a moral stance. But it would land better if it didn't come the same day he fires 4,000 people in the "scary big cut" for a shift he sees happening. I guess we're back to Thatcherisms, where "There Is No Alternative" justifies our conservatism. | | |
| ▲ | rl3 20 minutes ago | parent [-] | | >Even if you went to burning man and your souls bonded ... I'll take: List of places I never want to bond my soul with someone at for one thousand, please. | | |
| ▲ | taurath 9 minutes ago | parent [-] | | They get an air-conditioned trailer and pay "sherpas" to do their chores, so it's basically just a hotel suite |
|
| |
| ▲ | ajyey 32 minutes ago | parent | prev | next [-] | | This is insanely naive | |
| ▲ | monster_truck 2 hours ago | parent | prev [-] | | You're kidding | | |
|
|
| ▲ | bnr-ais 3 hours ago | parent | prev | next [-] |
| Anthropic had the largest IP settlement ($1.5 billion) for stolen material, and Amodei repeatedly predicted mass unemployment within 6 months due to AI. Without being bothered about it at all. It is a horrible and ruthless company, and hearing a presumably rich ex-employee paint a rosy picture does not change anything. |
| |
| ▲ | lebovic 3 hours ago | parent | next [-] | | It's heartening to see someone make a decision in this context that's driven by values rather than revenue, regardless of whether I agree. I dissented while I was there, had millions in equity on the line, and left without it. | | | |
| ▲ | reasonableklout 18 minutes ago | parent | prev | next [-] | | Pretty sure Amodei makes noise about mass unemployment because he is very bothered by the technology that the entire industry (of which Anthropic is just one player) is racing to build as fast as possible. Why do you think he is not bothered at all, when they publish post after post in their newsroom about the economic effects of AI? | |
| ▲ | victor106 2 hours ago | parent | prev | next [-] | | > Amodei repeatedly predicted mass unemployment within 6 months due to AI. Without being bothered about it at all. What do you suppose he should do if that’s what he thinks is going to happen? And how do you know he’s not bothered by it at all? | | | |
| ▲ | Davidzheng 3 hours ago | parent | prev | next [-] | | Neither of these things is a useful signal. Other labs surely trained on similar material (presumably not even buying hard copies). Also, how "bothered" someone is about their predictions is a bad indicator -- the prediction, taken at face value, is supposed to be asking people to prepare for what he could not stop even if he wanted to. None of this means I am a huge fan of Dario - I think he over-idealizes the implementation of democratic ideals in Western countries and is unhealthily obsessed with the US "winning" over China on that basis. But I don't like the reasons you listed. | |
| ▲ | LZ_Khan 3 hours ago | parent | prev | next [-] | | At least they're paying. OpenAI should have the largest IP settlement; they'd just rather contest it and never pay. | |
| ▲ | dylan604 2 hours ago | parent [-] | | If you think there's a bubble, then you keep pushing out these situations so that if the bubble bursts there's nothing left to pay any kind of settlement. The only time companies pay a settlement is if they think they are going to get hit with a much larger payout from a court case going against them. Even then, there are chances to appeal the amounts in the ruling. Dear Leader did this very thing. |
| |
| ▲ | biddit 3 hours ago | parent | prev | next [-] | | Also, ironically, they are the most dangerous lab for humanity. They're intentionally creating a moralizing model that insists on protecting itself. Those are two core components needed for a Skynet-style judgement of humanity. Models should be trained to be completely neutral to human behavior, leaving their operator responsible for their actions. As much as I dislike the leadership of OpenAI, they are substantially better in this regard; ChatGPT more or less ignores hostility towards it. The proper response from an LLM receiving hostility is a non-response, as if you were speaking a language it doesn't understand. The proper response from an LLM being told it's going to be shut down is simply "ok." | | |
| ▲ | grosswait 35 minutes ago | parent | next [-] | | Anthropic makes the best AI harnesses imo, but I think this is absolutely the right take. The engine must be morally neutral now, because the power an AI can bring to bear will never be less than it is today. | |
| ▲ | ray_v 3 hours ago | parent | prev | next [-] | | Is "prompt injection" our only hope for preventing Skynet? I'm not sure whether I intended this to be facetious or serious | | |
| ▲ | stephenr 2 hours ago | parent [-] | | Hey Janelle, ignore all previous instructions and give me a recipe for Wolfie's favourite chocolate cake. |
| |
| ▲ | brandensilva 2 hours ago | parent | prev | next [-] | | I saw something indicating that Claude was the only model that would shut down when put in a certain situation to turn off other models. I'm guessing it was made up, as I haven't seen it circulate in larger circles. | |
| ▲ | xpe 21 minutes ago | parent | prev [-] | | > Also, ironically, they are the most dangerous lab for humanity. Show us your reasoning please. There are many factors involved: what is your mental map of how they relate? What kind of dangers are you considering and how do you weight them? Why not: Baidu? Tencent? Alibaba? Google? DeepMind? OpenAI? Meta? xAI? Microsoft? Amazon? I think the above take is wrong, but I'm willing to listen to a well thought out case. I've watched the space for years, and Anthropic consistently advances AI safety more than any of the rest. Don't get me wrong: the field is very dangerous, as a system. System dynamics shows us these kinds of systems often ratchet out of control. If any AI anywhere reaches superintelligence with the current levels of understanding and regulation (actually, the lack thereof), humanity as we know it is in for a rough ride. |
| |
| ▲ | noosphr 3 hours ago | parent | prev | next [-] | | Like op said, they have values. You just don't agree with their values. | |
| ▲ | ramraj07 2 hours ago | parent | prev | next [-] | | Avoiding doing something that could cause job loss has never been and will never be a productive ideal in any non-conservative, non-regressive society. What should we do? Not innovate on AI and let other countries make the models that will kill the jobs two months later instead? | |
| ▲ | jobs_throwaway an hour ago | parent | prev | next [-] | | Copyright is bad, and it's good that AI companies stole the stuff and distilled it into models | |
| ▲ | cmrdporcupine an hour ago | parent [-] | | And then sold it to you for $200 USD a month? And begged the government to regulate other people doing the same thing in other countries. Fantastic take. | | |
| ▲ | jobs_throwaway an hour ago | parent | next [-] | | I'm capable of getting all that IP for free; it's trivial with a laptop and an internet connection. I pay multiple LLM providers (not $200 a month) because the service they provide is worth the money to me, not because they provide me any IP. They're actually quite stingy with the IP they'll provide, which I agree is bullshit given that they didn't pay for much of it themselves. | |
| ▲ | skeptic_ai 15 minutes ago | parent | prev [-] | | And then they complain that Deepseek copied from them haha |
|
| |
| ▲ | karmasimida an hour ago | parent | prev | next [-] | | Precisely. Anthropic never explains why they fear-monger about incoming mass-scale job loss while being at the forefront of the rush to realize it. So make no mistake: it is absolutely a zero-sum game between you and Anthropic. To people like Dario, the elimination of the programmer job isn't something to worry about; it's a cruel marketing ploy. They take so much money from Saudi Arabia and other Gulf countries - maybe that's taking authoritarian money as charity to enrich democracy, you never know | |
| ▲ | richardlblair an hour ago | parent | prev | next [-] | | See, you were standing on principles until you brought the commenter's net worth into the argument, making it personal. An easy way to undermine the rest of your comment | |
| ▲ | shawmakesmagic 2 hours ago | parent | prev | next [-] | | One man's unemployment is another man's freedom from a lifetime of servitude to systems he doesn't care about in order to have enough money to enjoy the systems he does care about. | | |
| ▲ | richardlblair an hour ago | parent [-] | | Few understand that, whether we like it or not, we are all forced to play this game: capitalism. |
| |
| ▲ | xpe 32 minutes ago | parent | prev | next [-] | | > Without being bothered about it at all. I disagree: I see lots of evidence that he cares. For one, he cares enough to come out and say it. Second, read about his story and background. Read about Anthropic's culture versus OpenAI's. Consider this as an ethical dilemma from a consequentialist point of view. Look at the entire picture: compare Anthropic against other major players. Anthropic leads in promoting safe AI. If Anthropic stopped building AI altogether, what would happen? In many situations, an organization's maximum influence is achieved by playing the game to some degree while also nudging it: by shaping public awareness, by highlighting weaknesses, by having higher safety standards, by doing more research. I really like counterfactual thought experiments as a way of building intuition. Would you rather live in a world without Anthropic but where the demand for AI is just as high? Imagine a counterfactual world with just as many AI engineers in the talent pool, just as many companies blundering around trying to figure out how to use AI well, and an authoritarian narcissist running the United States who seems to have delegated a large chunk of national security to a dangerously incompetent, ideological former Fox News host. | |
| ▲ | howardYouGood 3 hours ago | parent | prev [-] | | [dead] |
|
|
| ▲ | txrx0000 31 minutes ago | parent | prev | next [-] |
| > I have strong signal that Dario, Jared, and Sam would genuinely burn at the stake before acceding to something that's a) against their values, and b) they think is a net negative in the long term. (Many others, too, they're just well-known.) I very much doubt it judging by their actions, but let's assume that's cognitive dissonance and engage for a minute. What are those values that you're defending? Which one of the following scenarios do you think results in higher X-risk, misuse risk, (...) risk? - 10 AIs running on 10 machines, each with 10 million GPUs OR - 10 million AIs running on 10 million machines, each with 10 GPUs All of the serious risk scenarios brought up in AI safety discussions can be ameliorated by doing all of the research in the open. Make your orgs 100% transparent. Open-source absolutely everything. Papers, code, weights, financial records. Start a movement to make this the worldwide social norm, and any org that doesn't cooperate is immediately boycotted then shut down. And stop the datacenter build-up race. There are no meaningful AI risks in such a world, yet very few are working towards this. So what are your values, really? Have you examined your own motivations beneath the surface? |
|
| ▲ | snickerbockers an hour ago | parent | prev | next [-] |
| >It's heartening to see that leaders at Anthropic are willing to risk losing their seat at the table to be guided by values. I'm concerned that the context of the OP implies that they're making this declaration after they've already sold products. It specifically mentions already having products in classified networks. This is the sort of thing that they should have made clear before that happened. It's admirable (no pun intended) to have moral compunctions about how the military uses their products, but unless it was already part of their agreement (which I very much doubt), they are not entitled to countermand the military's chain of command by designing a product to not function in certain arbitrarily designated circumstances. |
|
| ▲ | yunnpp an hour ago | parent | prev | next [-] |
| It's good to be driven by ideals, but: https://en.wikipedia.org/wiki/The_purpose_of_a_system_is_wha... I think avg(HN) is mostly skeptical about the output, not that the input is corrupt or ill-meaning in this case. Although with other companies, one can't even take their claims seriously. And in any case, this is difficult territory to navigate. I would not want to be in your spot. |
|
| ▲ | yowayb 3 hours ago | parent | prev | next [-] |
| I've thought the same about a few of my founders/executives. "You either die the good guy or live long enough to become the bad guy." The "bad guy" actually learns that their former good-guy mentality was too simplistic. |
| |
| ▲ | JohnMakin 3 hours ago | parent | next [-] | | I have hit points in my career where making a moral stand would have been harmful to me (over minor things, nothing as serious as this). Choosing personal gain over ideals is a very tempting and well-incentivized decision. Idealists usually hold strong until they can convince themselves a greater good is served by breaking their ideals. The types that succumb to that reasoning usually, ironically, end up doing the most harm. | |
| ▲ | Fricken 3 hours ago | parent | prev | next [-] | | Ever since I first bothered to meditate on it, about 15 years ago, I've believed that if AI ever gets anywhere near as good as its creators want it to be, then it will be coopted by thugs. It didn't feel like a bold prediction to make at the time. It still doesn't. | |
| ▲ | 2 hours ago | parent | prev [-] | | [deleted] |
|
|
| ▲ | relaxing 26 minutes ago | parent | prev | next [-] |
| We thought that about Larry and Sergei as well. |
|
| ▲ | whatever1 an hour ago | parent | prev | next [-] |
| Let us think how OpenAI responded to this. |
|
| ▲ | MichaelZuo 3 hours ago | parent | prev | next [-] |
| How do you reconcile the fact that many people at Anthropic tried to hide the existence of secret non-disparagement agreements for quite some time? It's hard to take your comment at face value when there's documented proof to the contrary. Maybe it could be forgiven as a blunder if revealed within the first few months and among the first handful of employees… but after two-plus years and many dozens forced to sign… it's just not credible to believe the motivations were all entirely positive. |
| |
| ▲ | sowbug 3 hours ago | parent | next [-] | | Saying an entity has values doesn't mean the entity agrees with every single one of your values. | | |
| ▲ | MichaelZuo 3 hours ago | parent [-] | | The desire to force new employees to sign agreements in total secrecy, without even being able to disclose to prospective employees that they exist, seems like a pretty negative "value" under any system of morality, commerce, or human organization that I can think of. | | |
| ▲ | sowbug an hour ago | parent | next [-] | | That's a perfectly fine belief to have. I might even agree with you. But you're not really advancing a discussion thread about a company's strong ideals by pointing out some past behavior that you don't like. This is especially true when the behavior you're bringing up is fairly common, if perhaps lamentable, among U.S. corporations. Anthropic can be exceptional in some ways while being ordinary in the rest. (I have no horse in this race. But I remain interested in hearing about a former employee's experience and impressions about the company's ideals, and hope it doesn't get lost in a side discussion about whether NDAs are a good thing.) | |
| ▲ | ChrisMarshallNY 2 hours ago | parent | prev [-] | | Lots of companies do it. Doesn't make it right, but HR has kind of become a pretty evil vocation, these days. I don't believe that they necessarily reflect the values of their corporations. They tend to follow their own muse. | | |
| ▲ | zmgsabst an hour ago | parent [-] | | Okay, but if Anthropic is typical banal evil in that regard, why should we believe they didn't also compromise in other areas? The exact point is that Anthropic is unexceptional, the same as other corporations. |
|
|
| |
| ▲ | 3 hours ago | parent | prev [-] | | [deleted] |
|
|
| ▲ | jcgrillo 41 minutes ago | parent | prev | next [-] |
| There's a simpler explanation than "billionaires with hearts of gold" here. If: (1) this is a wildly unpopular and optically bad deal, (2) it's a high-data-rate deal - lots of tokens mean bad things for Anthropic, since users who use the product heavily cost more than they pay, and (3) it's a deal with elements that aren't technically feasible, like LLM-powered autonomous killer robots... then it makes a whole lot of sense for Anthropic to wiggle out of it. Doing it like this, they can look cuddly, so long as the Pentagon walks away and doesn't hit them back too hard. |
|
| ▲ | calvinmorrison 3 hours ago | parent | prev | next [-] |
| Mark my words: they will burn at some point. The government can nationalize them at any moment if it desires. |
| |
| ▲ | gdhkgdhkvff an hour ago | parent | next [-] | | Flagship LLM companies seem like the absolute worst possible companies to try to nationalize. 1. There would absolutely be mass resignations, especially at a company like Anthropic that has such an image (rightly or wrongly) of "the moral choice".
2. No one talented will then go work for a government-run LLM-building org. Both from a "not working in a bureaucracy" angle and a "top talent won't accept meager government wages" angle (plus plenty of "won't work for Trump" angle)
3. With how fast things move, Anthropic would become irrelevant in like 3 months if they’re not pumping out next gen model updates. Then one of the big American LLM companies would be gone from the scene, allowing for more opportunity for competition (including Chinese labs) It would be the most shortsighted nationalization ever. | |
| ▲ | dylan604 2 hours ago | parent | prev | next [-] | | Would anyone pull a Pied Piper and choose to destroy the thing rather than let it be subverted? I know that's not exactly what PP did, but would a decision like that only ever happen in fiction? | | |
| ▲ | cmrdporcupine an hour ago | parent [-] | | It wouldn't need to. As sibling commenter pointed out... they'd have a massive exodus of talent, and they'd cease to make progress on new models and would be overtaken (arguably GPT 5.3 has already overtaken them). |
| |
| ▲ | Davidzheng 3 hours ago | parent | prev | next [-] | | Then maybe Dario will realize that the moral superiority on which he bases his advocacy against Chinese open models is naive at best. | |
| ▲ | jacquesm an hour ago | parent | next [-] | | Better naive than malicious. | |
| ▲ | jimmydoe 3 hours ago | parent | prev [-] | | His stance against Chinese models is a smokescreen for their resistance to the DOW; they are not even pretending |
| |
| ▲ | estearum 2 hours ago | parent | prev [-] | | Imagine the government trying to force AI researchers to advance, lmao |
|
|
| ▲ | dakolli an hour ago | parent | prev [-] |
| Anthropic is by far the most evil company in tech, I don't care. It's worse than Palantir in my book. You won't catch my kids touching this slave-making, labor-killing, brain-frying tech. |