| ▲ | ramoz 7 hours ago | |
This doesn’t solve the problem. The lethal trifecta as defined is not solvable and is misleading in terms of “just cut off a leg”. (Though firewalling is practically a decent bubble wrap solution). But for truly sensitive work, you still have many non-obvious leaks. Even in small requests the agent can encode secrets. An AI agent that is misaligned will find leaks like this and many more. | ||