reassess_blind 3 hours ago

Not as smart as everyone thinks it is, maybe, but a model like Fable 5 without safeguards against offensive cyber attacks would be a nightmare. There are millions of improperly secured web applications that, in the wrong hands, would be easily exploited by these models.

lillesvin 3 hours ago | parent | next [-]

There have been millions of trivially exploitable vulnerabilities out there for decades, many of which could be easily discovered with simple scanning tools or manual probing. This is hardly a new situation, and LLMs really aren't that impressive at pentesting, even with these simple exploits. Maybe they are if you're not a pentester, but then ZAP, Burp, Nessus, SQLMap, etc. are likely also impressive if you put a little effort into learning how to use them. Many AI advocates just aren't interested in learning those skills themselves.

It's the same situation as with vibe coding. Everyone and their grandma can have an LLM spit out a web application without any programming experience, but if you're a programmer, you'll likely quickly see some issues with maintainability and further development of the code base.

zomiaen 2 hours ago | parent | next [-]

>LLMs really aren't that impressive at pentesting

The point is that Mythos apparently is quite capable and has developed novel exploits on its own.

lillesvin 2 hours ago | parent [-]

That's the claim, yes. Has any proof been made available yet? (Genuinely asking here because I haven't been paying that close attention.)

reassess_blind 2 hours ago | parent | prev [-]

[dead]

tayo42 3 hours ago | parent | prev [-]

In a substantially different way than how it is now? You can put something listening on ports 22, 80, and 443 and log how much stuff tries to get in.
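
For anyone who wants to try that, here is a rough sketch of such a listener in Python. It only accepts TCP connections, records the source IP and the local port, and hangs up. The function names `open_listeners` and `log_attempts` are made up for this example, and it assumes IPv4 and plain TCP with no protocol emulation, so treat it as a starting point and not a real honeypot.

```python
import selectors
import socket
import time


def open_listeners(ports, host="0.0.0.0"):
    """Bind one non-blocking TCP listening socket per port (hypothetical helper)."""
    listeners = []
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen()
        sock.setblocking(False)
        listeners.append(sock)
    return listeners


def log_attempts(listeners, duration):
    """Accept connections for `duration` seconds; return (timestamp, ip, local_port) tuples."""
    log = []
    sel = selectors.DefaultSelector()
    for sock in listeners:
        sel.register(sock, selectors.EVENT_READ)
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        for key, _ in sel.select(timeout=remaining):
            try:
                conn, (ip, _src_port) = key.fileobj.accept()
            except BlockingIOError:
                continue  # the peer went away before we could accept
            local_port = key.fileobj.getsockname()[1]
            log.append((time.time(), ip, local_port))
            conn.close()  # no interaction, just count the knock
    sel.close()
    return log


# Example (needs root or equivalent privileges for ports below 1024 on most systems):
#   listeners = open_listeners([22, 80, 443])
#   for ts, ip, port in log_attempts(listeners, duration=3600):
#       print(ts, ip, port)
```

Note that this only counts connection attempts. It says nothing about what the other side would have tried next, which is where the disagreement in this thread actually lies.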

reassess_blind 2 hours ago | parent [-]

Yes, it is substantially different. A targeted, relentless attack by a state-of-the-art cybersecurity model is far more likely to find obscure vulnerabilities than a traditional automated attack or fuzzer. These models are so much better at finding security holes than anything we've seen before.