| ▲ | GuB-42 12 hours ago | ||||||||||||||||||||||||||||||||||
I don't see the big issues with jailbreaks, except maybe for LLMs providers to cover their asses, but the paper authors are presumably independent. That LLMs don't give harmful information unsolicited, sure, but if you are jailbreaking, you are already dead set in getting that information and you will get it, there are so many ways: open uncensored models, search engines, Wikipedia, etc... LLM refusals are just a small bump. For me they are just a fun hack more than anything else, I don't need a LLM to find how to hide a body. In fact I wouldn't trust the answer of a LLM, as I might get a completely wrong answer based on crime fiction, which I expect makes up most of its sources on these subjects. May be good for writing poetry about it though. I think the risks are overstated by AI companies, the subtext being "our products are so powerful and effective that we need to protect them from misuse". Guess what, Wikipedia is full of "harmful" information and we don't see articles every day saying how terrible it is. | |||||||||||||||||||||||||||||||||||
| ▲ | cseleborg 12 hours ago | parent | next [-] | ||||||||||||||||||||||||||||||||||
If you create a chatbot, you don't want screenshots of it on X helping you to commit suicide or giving itself weird nicknames based on dubious historic figures. I think that's probably the use-case for this kind of research. | |||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||
| ▲ | calibas 11 hours ago | parent | prev [-] | ||||||||||||||||||||||||||||||||||
I see an enormous threat here, I think you're just scratching the surface. You have a customer facing LLM that has access to sensitive information. You have an AI agent that can write and execute code. Just image what you could do if you can bypass their safety mechanisms! Protecting LLMs from "social engineering" is going to be an important part of cybersecurity. | |||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||