throwaw12 3 hours ago

are we cooked yet?

Benchmarks look very impressive! Even if they're flawed, they still translate to real-world improvements.

boring-human an hour ago

Yep, I think the lede might be buried here and we're probably cooked (assuming you mean SWEs, but the writing has been on the wall for 4 months.)

I guess I'm still excited. What's my new profession going to be? Longer term, are we going to solve diseases and aging? Or are the ranks going to thin from 10B to 10000 trillionaires and world-scale con-artist misanthropes plus their concubines?

whalesalad 3 hours ago

There is an entire section on crafting chemical/bio weapons so yeah I think we are cooked.

redfloatplane 3 hours ago

There's been a section on this in nearly every system card Anthropic has published, so this isn't a new thing, and this model doesn't carry notably higher risk than past models either:

> 2.1.3.2 On chemical and biological risks

> We believe that Mythos Preview does not pass this threshold due to its noted limitations in open-ended scientific reasoning, strategic judgment, and hypothesis triage. As such, we consider the uplift of threat actors without the ability to develop such weapons to be limited (with uncertainty about the extent to which weapons development by threat actors with existing expertise may be accelerated), even if we were to release the model for general availability. The overall picture is similar to the one from our most recent Risk Report.