tlb 2 hours ago

Yes, that is basically the plan. It's based on the belief that unfettered AI would let anyone be a supervillain and destroy the world. There are enough would-be supervillains out there, but they rarely get far because they can't get teams of smart people to build doomsday machines for them. So the AI has to not let anyone do evil with it.

Unfortunately, that won't feel very much like freedom.

lebovic an hour ago | parent [-]

It sounds like you might not agree with that belief.

While I don't agree with their actions here, I do think there's sufficient reason to hold that belief.

On some fronts (e.g. security, where you have more experience than I do), I think the challenges are surmountable. But on other fronts (e.g. bio), a single errant actor could plausibly kill millions or billions of people with sufficiently powerful AI. We don't have good defenses here, and those actors do exist.

I still don't agree with these actions, but I do think I agree with their assumptions.

zozbot234 26 minutes ago | parent [-]

The model release cards for Opus have repeatedly and consistently stressed that the model doesn't have the fiddly know-how that's required to provide meaningful assistance in possibly dangerous subfields of biology. Mythos (Fable without the overly strict guardrails) has shown improvements in things like drug design, but even then the situation isn't really that different. This risk is ridiculously overblown, and the way to manage it sensibly is to introduce meaningful oversight for actors that seek to order the actual specialized materials involved (especially any synthetically generated genes/proteins/whatever).

lebovic a minute ago | parent [-]

No, Anthropic's model cards have claimed that the models don't show considerably more uplift than previous models, which were still already capable. Also, I'd consider those evals a lower bound of capabilities that can be elicited from a model.

I participated in the internal uplift test for Sonnet 3.7, and even then, one non-expert got huge uplift from the model [1].

The team behind Biomni, a biomedical agent that's widely used by researchers, has found consistent gains between models [2]. I trust them, because I visited them to build their HPC tool [3], which the model is quite capable of using – more so than most grad students. They also care about real usage from real people.

SecureBio is also somewhat public with their evals [4], which have also continued to show increasing uplift.

And while synthesis monitoring is a part of the solution, I think you might underestimate just how much goes under the radar. See the Reedley lab for an example [5].

Is Anthropic still effectively throttling beneficial biomedical research? Yes! And so is OpenAI. But the underlying capability is still actually dual use.

1: See page 25 in https://www-cdn.anthropic.com/9ff93dfa8f445c932415d335c88852...

2: Their benchmark has a preprint at https://www.biorxiv.org/content/10.64898/2026.05.12.724604v1...

3: https://x.com/phylo_bio/article/2029233694775624096

4: https://securebio.org/