zmmmmm | 3 hours ago
In a pure sense, no, it's probably not solvable completely. But in a practical sense, yes, I think it's solvable enough to support broad use cases of significant value. The most unsolvable part is prompt injection. For that you need full tracking of the trust level of content the agent is exposed to, and a method of linking that to what actions it has access to. I actually think this needs to be fully integrated with the sandboxing solution. Once an agent is "tainted", its sandbox should inherently shrink down to the radius where risk is balanced with value. For example, my fully trusted agent might have a balance of $1000 in my AWS account, while a tainted one might have that reduced to $50. So another aspect of sandboxing is to make the security model dynamic.
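As a minimal sketch of what a "dynamic security model" could look like, here's hypothetical Python: all names (`AgentSandbox`, `ingest`, `authorize_spend`) and the dollar caps are illustrative, not any real framework. The idea is just that exposure to untrusted content flips a taint flag, and the permitted blast radius shrinks accordingly.

```python
from enum import Enum


class Trust(Enum):
    TRUSTED = "trusted"
    TAINTED = "tainted"


# Illustrative per-trust-level spend caps in USD, mirroring the
# $1000 trusted / $50 tainted example above.
SPEND_LIMITS = {Trust.TRUSTED: 1000, Trust.TAINTED: 50}


class AgentSandbox:
    """Tracks taint and dynamically shrinks the allowed blast radius."""

    def __init__(self) -> None:
        self.trust = Trust.TRUSTED

    def ingest(self, content: str, source_trusted: bool) -> None:
        # Any exposure to untrusted content taints the whole session;
        # taint is sticky and never upgrades back to trusted.
        if not source_trusted:
            self.trust = Trust.TAINTED

    @property
    def spend_limit(self) -> int:
        return SPEND_LIMITS[self.trust]

    def authorize_spend(self, amount: int) -> bool:
        # The sandbox, not the agent, enforces the current cap.
        return amount <= self.spend_limit


sandbox = AgentSandbox()
assert sandbox.authorize_spend(500)          # trusted: $1000 cap
sandbox.ingest("random web page", source_trusted=False)
assert not sandbox.authorize_spend(500)      # tainted: cap drops to $50
assert sandbox.authorize_spend(25)
```

In a real system the taint signal would come from the agent runtime (which tool fetched the content, from where), and the limits would map to actual IAM policies or budget alarms rather than an in-process check.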