awestroke 3 hours ago

This is becoming a bit scary. I almost hope we'll reach some kind of plateau for LLM intelligence soon.

lebovic 2 hours ago | parent | next [-]

A plateau is unlikely, at least for cybersecurity. RL scales well here and is replicable outside of Anthropic (rewards are verifiable, so setting up the training environment doesn't require that much cleverness).
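To make the verifiable-rewards point concrete, here's a minimal sketch of a CTF-style grader, assuming a setup where each training task has a known flag (all names here are hypothetical, not any lab's actual environment). The reward is a deterministic string check, so no learned judge or human grading is needed:

    import hmac

    def reward(agent_answer: str, expected_flag: str) -> float:
        # Verifiable reward: 1.0 iff the agent produced the exact flag.
        # compare_digest is a constant-time comparison, so an agent that
        # can probe the grader can't recover the flag from timing.
        if hmac.compare_digest(agent_answer.strip(), expected_flag):
            return 1.0
        return 0.0

    print(reward("FLAG{stack_smash}", "FLAG{stack_smash}"))  # 1.0
    print(reward("FLAG{wrong_guess}", "FLAG{stack_smash}"))  # 0.0

Because the check is exact and cheap, many such environments can be generated and graded automatically, which is what makes this kind of RL setup straightforward to replicate.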

The post also points out that the model wasn't trained specifically on cybersecurity; the capability emerged as a side effect, so I think there's still a lot of headroom.

It's scary, but there's also some room for cautious non-pessimism. More people than ever can cause billions of dollars of damage in attacks now [1], but the same tools can be used defensively. For that reason, I'm more optimistic about mitigations in security than in other risk areas like biosecurity.

[1]: https://www.noahlebovic.com/testing-an-autonomous-hacker/

hibikir 2 hours ago | parent | prev | next [-]

On a topic like cybersecurity, we never win by not looking: one needs top-of-the-line knowledge of how to break a system to be able to protect it. We face the same dilemma with human experts: the same government-sponsored unit that tells you to update your encryption can hold on to a vulnerability and exploit it at their leisure.

Given that it's impossible to stop people not aligned with us (for any definition of "us") from doing AI research, the most reasonable way forward is to dedicate compute to the frontier and to automatically send responsible disclosures to major projects. That could itself be a viable product: just as companies pay for dubious security scans and publicize that they run them, an LLM company could offer genuinely expensive security reviews with a preview model, and charge accordingly.

dist-epoch an hour ago | parent | prev | next [-]

The immediate plateau is the energy output of the Sun captured by the Dyson Swarm around it. Until then, it's smooth sailing.

esafak 2 hours ago | parent | prev | next [-]

We need to promote alignment and other ethics benchmarks; we can't change what we don't measure. I don't even know any off the top of my head.

websap 3 hours ago | parent | prev [-]

If we don't innovate, someone else will. This is the very nature of being human. We summit mountains, regardless of the danger or challenge.

vonneumannstan 3 hours ago | parent [-]

>If we don't innovate, someone else will.

Terrible take. You don't get to push the extinction button just because you think China will beat you to the punch.

>This is the very nature of being human. We summit mountains, regardless of the danger or challenge.

No, just no... We barely survived the Cold War, at times through pure luck. AI is at least as dangerous as that, if not more. Our capabilities have far outstripped our wisdom. As you have so cleanly demonstrated.

dist-epoch an hour ago | parent [-]

You assume there is the option of not pushing the extinction button. Nobody asked chimps if they wanted humans around. These processes are outside our control.