wslh 10 days ago

100% agree. I can't find all the sources right now, but [1] and its references could be a good starting point for further exploration. I recall a proof or conjecture suggesting that it's impossible to build an "LLM firewall" capable of protecting against all possible prompts, though my memory might be failing me.

[1] https://arxiv.org/abs/2410.07283