GuB-42 10 hours ago:
Yes, agents. But for that, I don't think the usual approaches to censoring LLMs are going to cut it. It's like shrinking a text box on a web page to protect against buffer overflows: it's enough for honest users, but nobody who knows anything about cybersecurity would consider it adequate; the input has to be validated on the back end. In the same way, an LLM shouldn't have access to resources that shouldn't be directly accessible to the user. If the agent works on the user's data on the user's behalf (e.g. vibe coding), then I don't consider jailbreaking a big problem. It could help write malware and the like, but then again, it's not as if script kiddies couldn't work without AI.
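Concretely, the enforcement belongs in the tool layer, not in the model. A minimal sketch of the idea in Python, with entirely hypothetical names (PERMITTED_ROOTS, read_file), assuming a per-user root directory the agent is allowed to touch:

    # Sketch: the agent's tool layer re-checks the *user's* permissions on
    # every call, so a jailbroken model can't reach anything the user couldn't
    # reach directly. All names here are hypothetical, not a real framework.
    from pathlib import Path

    PERMITTED_ROOTS = {"alice": Path("/home/alice/project")}

    def read_file(user: str, path: str) -> str:
        root = PERMITTED_ROOTS.get(user)
        if root is None:
            raise PermissionError(f"unknown user: {user}")
        resolved = Path(path).resolve()
        # Enforce the boundary server-side, regardless of what the model asked for.
        if not resolved.is_relative_to(root):
            raise PermissionError(f"{resolved} is outside {root}")
        return resolved.read_text()

The point is that the check lives somewhere the model can't talk its way past: even a fully jailbroken model asking for /etc/shadow just gets a PermissionError.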
calibas 9 hours ago:
> If the agent works on the user's data on the user's behalf (e.g. vibe coding), then I don't consider jailbreaking a big problem. It could help write malware and the like, but then again, it's not as if script kiddies couldn't work without AI.

Tricking it into writing malware isn't the big problem I see. It's things like prompt injection from fetching external URLs; that's going to be a major route for RCE attacks.

https://blog.trailofbits.com/2025/10/22/prompt-injection-to-...

There are plenty of things we should be doing to help mitigate these threats, but not all companies follow best practices when it comes to technology and security...
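One common mitigation, sketched below under assumptions of mine: once anything fetched from the web enters the context, downgrade the agent to a read-only tool set, so injected instructions can't reach shell or write actions. Everything here (fetch_url, AgentSession, the tool sets) is made up for illustration, not any specific agent framework's API:

    # Sketch: privilege downgrade after the context is "tainted" by
    # untrusted web content. Hypothetical names throughout.
    import urllib.request

    FULL_TOOLS = {"read_file", "write_file", "run_shell", "fetch_url"}
    READ_ONLY_TOOLS = {"read_file", "fetch_url"}

    def fetch_url(url: str) -> str:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read(65536).decode("utf-8", errors="replace")

    class AgentSession:
        def __init__(self):
            self.tainted = False  # becomes True after any external fetch

        def allowed_tools(self) -> set[str]:
            return READ_ONLY_TOOLS if self.tainted else FULL_TOOLS

        def call_tool(self, name: str, **kwargs):
            if name not in self.allowed_tools():
                raise PermissionError(f"{name} blocked: context contains untrusted content")
            if name == "fetch_url":
                # Anything fetched may carry injected instructions, so taint
                # the session before returning the content to the model.
                self.tainted = True
                return fetch_url(**kwargs)
            ...  # dispatch the remaining tools

It doesn't stop the injection itself, only caps the blast radius, which is exactly the back-end-validation mindset from the parent comment.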