satvikpendem 10 hours ago

> Evaluations also show that it has a much lower ability to perform cybersecurity tasks than our current Opus models.

Why would they brag about something like this? It's like they know people want to use models to perform cybersecurity tasks yet knowingly deny them the ability.

And Opus 4.8 is still cheaper for a higher pass rate (much less open-weight models like GLM 5.2), so I'm not sure why I'd use Sonnet except on the low effort level for, I suppose, trivial tasks where I want it to work only 50% of the time, judging by the graph. The pricing doesn't really make any sense.

secretslol 9 hours ago | parent | next [-]

"Lower ability to perform cybersecurity-related tasks" makes me super concerned it will leave my codebase like Swiss cheese for any American granny with access to Fable 5, when we non-American Brits, or rest-of-worlders, don't have access to it to clean our codebases.

__alexs 9 hours ago | parent | next [-]

100% this. I read these caveats in new models and all I hear is "we made sure this model has no idea about computer security." Such a weird thing to brag about.

doublescoop 9 hours ago | parent | prev | next [-]

This is code for "this model can't be used to hack other systems as effectively as Opus or Mythos."

kube-system 8 hours ago | parent [-]

"dangerous cyber skills, such as developing software exploits" is very plainly referring to the same thing you are, but is more precise industry terminology rather than the loaded slang "hack".

doublescoop 7 hours ago | parent [-]

I was referring to "Lower ability to perform cybersecurity-related tasks," which is newspeak for hacking.

kube-system 5 hours ago | parent [-]

No, that is very intentionally referring to a broader set of things than "hacking".

matheusmoreira 5 hours ago | parent | prev | next [-]

That's the literal mission of the NSA. Security and strong cryptography for the US while everyone else gets "export grade" nonsense.

cute_boi 9 hours ago | parent | prev | next [-]

I think they don’t understand that cybersecurity skills are what prevent bad code from making it into production.

It’s like telling a chef to cook without a knife because knives can kill people.

Dario and his lackeys at Anthropic aren’t visionaries.

norseboar 9 hours ago | parent | next [-]

I think this is more aimed at the US gov't than anything. They want to be clear that it's not very good at hacking, so that the gov't won't ban it.

I'm sure they're well-aware that this also will make it worse at building secure systems, but the gov't isn't restricting releases based on that.

baq 9 hours ago | parent | prev [-]

I think you misunderstood what their vision is, or rather what their possible futures are. They are many steps ahead of almost everyone, both in wargaming possibilities and the actual realized path. What doesn’t make sense to you may be the only safe option for them.

tancop 8 hours ago | parent | next [-]

> What doesn’t make sense to you may be the only safe option for them

That's true because their point of view makes no sense for us. Dario is all in on the LessWrong machine-god theory and really believes they need to create a superintelligence before anyone else. That means doing as much as possible to slow down others' progress and accelerate your own. But the fact that they believe it's the only option doesn't make it true for the rest of us.

baq 8 hours ago | parent [-]

Never said otherwise, but it changes nothing. Their beliefs got them to this point on the timeline and that in itself cannot be ignored (or should I say, it should inform our priors...?) You can like or dislike them or what they do or don't do, but you must respect them regardless of that, purely because of their track record.

frabcus 7 hours ago | parent | prev [-]

I've been wondering this - I don't have an intuition for Anthropic's gaming around military applications, or how this stage could play out in terms of relationship to Government controlling AI.

Are there some Less Wrong posts or similar I should read that probably explain it?

Aeolun 5 hours ago | parent | prev | next [-]

I think that increasingly, the US will have to be passed by for these things. Clearly we’ll have to start looking to China for world leadership, to be the land of the free.

kube-system 8 hours ago | parent | prev | next [-]

> any American granny with access to Fable 5,

Fable is effectively not available to the general public in the US either

secretslol 5 hours ago | parent [-]

True, though Trump & Co. did give them permission to let Americans continue using it; Anthropic turned that down (for now at least...).

kube-system 5 hours ago | parent [-]

Because there is no practical way to comply with what they asked for. They'd have to start validating their users' passports.

goalieca 9 hours ago | parent | prev [-]

That’s not even close to true. Unless you’re vibe coding trash that a better model might catch.

secretslol 9 hours ago | parent | next [-]

I don't think so. During the time I was using Fable 5, I was getting it to clean security bugs that Opus 4.8 had introduced ... bugs which weren't localised to a single PHP file but were caused by cascading data flow through multiple PHP files. I'm not an expert on security but I know I wouldn't have found these myself. I knew from day one of Fable's release that it would do thorough security audits and fix loads of flaws, even offering up PoCs to help show that it fixed them, as long as I didn't explicitly ask it to do a security audit. I just said, "My codebase is a mess," and it went on for an hour doing a thorough security audit and helping plug numerous holes. This was before the "fix my code" story came out.

zlurker 10 hours ago | parent | prev | next [-]

They spent months hyping up Mythos and ended up with it banned. I’d assume they want to both differentiate their products and appeal to regulators here

worldsavior 9 hours ago | parent | next [-]

They will release it eventually. Once they see the Chinese models getting close to Mythos level, they will release it first, so it will be "revolutionary".

jaapz 9 hours ago | parent [-]

It was already released. US government is the only reason it's not available to us mere mortals anymore

satvikpendem 9 hours ago | parent | next [-]

Due to Dario hyping it up as a world-ending model. If they had kept their mouths shut, we'd all still have it now.

baq 9 hours ago | parent [-]

Where is GPT 5.6?

HDBaseT 5 hours ago | parent | next [-]

GPT 5.6 exists, just not for you and me.

Everyone dislikes when these models are provided for use by the Department of Defense, but we can likely assume these newer, more capable models are being used by the NSA, FBI, CIA and other Five Eyes agencies to develop more backdoors, hack into more things to spy on us all.

We get drip fed the weaker models, but only once all the 0days have been used against us.

081c28a92 8 hours ago | parent | prev | next [-]

Victim of the same hype generated by Dario. Now everyone has to walk on eggshells, do limited releases to trusted partners, and nerf their cybersecurity capabilities lest they get deemed “too powerful to release”.

M3L0NM4N 7 hours ago | parent [-]

Yeah and our government is continuing to take pages from China's playbook for the last fucking decade... and not the plays that work.

satvikpendem 6 hours ago | parent | prev [-]

If not for Dario hyping Mythos and Fable, GPT 5.6 would've released just fine on schedule as a point release without all the fear mongering. It was because Fable was banned that now the government is scrutinizing all models.

worldsavior 7 hours ago | parent | prev [-]

Obviously I meant released for public use.

sixothree 9 hours ago | parent | prev [-]

I'm starting to think it discovered a 0-day held hidden by our government.

noumenon1111 7 hours ago | parent [-]

Oh, it done found like 50 of those

kristianc 9 hours ago | parent | prev | next [-]

There are two classes of models now - the cybersecurity ones that none of us are getting, and the 'safe' models released for general consumption. This is letting us know which side of the divide it sits on.

Taek 9 hours ago | parent | next [-]

There are also Chinese models, which aren't trying to self-limit capabilities.

axus 9 hours ago | parent | next [-]

Surely the Chinese government will see US gov's intervention and say "Government control of business is stupid, our industry will have more independence from CCP control for the benefit of the world".

baq 9 hours ago | parent | prev [-]

…as long as you don’t ask them about certain dates or squares.

Also, I wouldn’t expect Mythos-class models to be allowed to be openly released by the CCP. Thinking otherwise is pure naivety.

girvo 6 hours ago | parent | next [-]

Depends on the model. Step (from StepFun) will happily yap about Tiananmen to you, if you're running it locally.

Quite a lot of these models have "safety" (lol) filters in front of them, rather than it being heavily encoded into the weights.

satvikpendem 6 hours ago | parent | prev | next [-]

Like the sibling said, you can fine tune if the rejections are in the weights but most often it's actually in the API harness itself; download Qwen or DeepSeek and run it locally to ask about certain dates and squares and it will happily tell you.

atemerev 9 hours ago | parent | prev [-]

Well, the weights are open. De-CCP-ing them is a trivial task, about 40 minutes on modern hardware. So it can be done for about $50.

bjelkeman-again 6 hours ago | parent [-]

Any good reference for how?

ls612 6 hours ago | parent | next [-]

https://github.com/p-e-w/heretic

atemerev 6 hours ago | parent [-]

Heretic is a general abliteration framework, mostly used to remove safety alignment, not CCP alignment. Yes, you can feed it China-specific prompts, but you'll need a dataset first (which is available at deccp).

Also, Heretic as it stands does not work for GLM 5.2 (at least as of 3 days ago when I tested it). You'll need some hybrid approaches.

atemerev 6 hours ago | parent | prev [-]

https://github.com/AUGMXNT/deccp - one example for Qwen models. For GLM 5.2, abliteration/realignment works somewhat differently, but with Claude's help, you can finish the job.

I am planning to release a steering patch for GLM 5.2 that eliminates pro-CCP alignment in the next few days.

bwat49 9 hours ago | parent | prev [-]

This seems rather counter-productive. Wouldn't a model with fewer cybersecurity capabilities be more likely to produce insecure code? Not to mention, Chinese models don't have these restrictions and can be used to exploit said insecure code.

I suppose I shouldn't be surprised at how the Trump admin is approaching AI regulation; counter-productive is really all they do.

ihsw 7 hours ago | parent [-]

As contradictory as it sounds, they (Anthropic) are probably trying to walk the fine line where their public models can write secure code but cannot exploit insecure code.

MostlyStable 9 hours ago | parent | prev | next [-]

Why do you think they are bragging? Anthropic has long been the company to give us by far the most in-depth information about their models, both positive and negative. I read this as them just stating a fact about this model that users would want to know.

organsnyder 9 hours ago | parent | next [-]

I'm absolutely certain that their marketing team has input on (if not ownership of) these announcements.

gallerdude 9 hours ago | parent | next [-]

Of course. But is it really impossible that Dario’s directive to the marketing team is “try not to make us look bad, but also be honest about our models’ capabilities, so people can stay informed”?

MostlyStable 9 hours ago | parent | prev [-]

I find it interesting how two different directly opposed messages seem to have both been interpreted as being nothing but marketing speak.

MallocVoidstar 9 hours ago | parent | prev | next [-]

The preceding sentence is

>Our safety assessments found that Sonnet 5 shows an overall lower rate of undesirable behaviors than Sonnet 4.6, and is generally safer to use in agentic contexts.

which is obviously painting that as a good thing. So reading the next sentence as "in other good news" is reasonable.

MostlyStable 9 hours ago | parent [-]

While I'm still not sure I would characterize that as bragging, you're right that that is a fair interpretation. However, another fair interpretation is something along the lines of "the downside or cost of this positive thing is this following negative thing."

satvikpendem 9 hours ago | parent | prev [-]

Anthropic, most in-depth? That's laughable given how closed down they've been over the years. If you want in-depth, DeepSeek actually still publishes papers on its methods for anyone to implement, which has made it by far the most cost-efficient model provider for the performance.

MostlyStable 9 hours ago | parent [-]

I was talking about reporting on testing and capabilities. Yes, open models provide a greater amount of information about the development of the model and how to run it yourself, but I am quite confident that literally no other AI company, open or closed, tests the capabilities of its models and reports on them as thoroughly.

K0balt 9 hours ago | parent | prev | next [-]

Restricting the models isn’t about restricting offensive capabilities. They were already very well aligned to reduce that risk.

This recent government interference is about trying to preserve US offensive cyberwarfare and cyberespionage capabilities. It’s not about “bad actors”. It’s about defensive capabilities becoming pervasive and cheap, which would kneecap US cyberoffensive capability.

It’s like making seatbelts illegal so that police chases can be more effective.

bluepeter 9 hours ago | parent | prev | next [-]

Flowers for Algernon. And, sadly, expect this from now on. You saw it with OpenAI releasing Sol/Terra/Luna with a chart showing how they weren't quite as good as Mythos. It's all messaging to the USG to try to avoid/minimize arbitrary review from multiple agencies. 'Hey, it's smart, but look how stupid it is at "cyber."'

dgacmu 9 hours ago | parent | prev | next [-]

One of the best queries I've done with an LLM recently was: Create a plan for improving the robustness and resilience of this code, particularly to untrusted inputs.

Gemini wouldn't do a security audit. But it came up with a great set of mitigations and identified an extant XSS flaw in the process of improving robustness.

There's an awful lot of good that can come from proactive, defensive use of LLMs. I realize there's also a lot of pain when the difficulty of exploit finding drops suddenly, but in the long term we may all benefit from the defensive side of this.

lanthissa 9 hours ago | parent | prev | next [-]

So it doesn't get blocked. Last time they said a model was great at cyber, it didn't turn out well.

Philpax 10 hours ago | parent | prev | next [-]

To avoid Lutnick getting on their case again.

dgellow 9 hours ago | parent [-]

He has the opportunity to do the funniest thing ever

johnfn 9 hours ago | parent | prev | next [-]

> Why would they brag about something like this? It's like they know people want to use models to perform cybersecurity tasks yet knowingly deny them the ability.

What exactly do you want Anthropic to say here? "This model, the one we are about to give to the entire world for cheap, is really good at hacking"? Saying Sonnet is terrible at cybersecurity is the most reasonable thing they can say, out of a lot of bad options.

nozzlegear 9 hours ago | parent | prev | next [-]

It seems obvious to me that they put that in there in an effort to avoid another reaming out by the long, orange dick of the US government.

pseudosavant 7 hours ago | parent | prev | next [-]

So that the current US administration doesn't block broad usage of Sonnet 5 probably. They'd have to collect your ID and approve you if it was good at cybersecurity. Because such is the freedom in the U.S. right now.

doctoboggan 10 hours ago | parent | prev | next [-]

You have to pay more for that, and/or go through some USG vetting process.

2001zhaozhao 9 hours ago | parent | prev | next [-]

They are obviously trying to avoid getting Sonnet 5 blocked.

WithinReason 9 hours ago | parent | prev | next [-]

That part is likely directly addressed to the US government.

chvid 9 hours ago | parent | prev | next [-]

Does it mean it generates code with random security holes?

jayd16 9 hours ago | parent | prev | next [-]

Market segmentation?

re-thc 9 hours ago | parent | prev | next [-]

> And Opus 4.8 is still cheaper for a higher pass rate

Unless it spams as much as Opus, I doubt it. Opus 4.8 literally spams text like puke. On a longer run, especially if you get cache misses here and there, the bulk of the cost is all the extra context it adds.

drcongo 9 hours ago | parent | prev [-]

What makes that a brag?