ArcHound 3 days ago

Right, what got me going is the reduction of plenty of cybersecurity concepts into a simple "safe" label in the diagram.

So what I meant is that before you discard all of the current security practices, it's better to learn about the current approach.

From another angle, maybe the diagram could be fixed by changing "safe" to "danger" and "danger" to "OMG stop". But that still ignores the business perspective and the nature of the protected asset.

I am also happy to see the edit in the article, props to the author for that!

And to address the last question: yes, no one has proposed that right now. But I have been in plenty of discussions about security approaches. And let me tell you, sometimes it only takes one sentence that the leadership likes to hear to derail the whole approach (especially if it results in cost savings). So I might be extra sensitive to such ideas and I try to uproot them before they bloom fully.

jFriedensreich 3 days ago | parent [-]

Hmm, what do you mean by current approach? This is new territory and agent safety is an unsolved problem. There is no current approach, unless you mean not building agent systems and using humans instead. The trifecta is just a tool, on the level of physics saying "ignore friction". We also assume the model itself is trustworthy and not poisoned most of the time, but of course when designing a real-world system you need to factor that in too.

ArcHound 3 days ago | parent [-]

Yes, by current approach I mean security best practices for non-LLM apps. Plenty of those are directly applicable.

And yes, LLMs have some challenges. But discarding all of the lessons and principles we've discovered over the years is not the way. And if we need to discard some of them, we should understand exactly why they are no longer applicable.

EDIT: I know that models need to omit stuff to be useful. But this model omits too much - claiming that something is "safe" should be a red flag to anyone working in security.