jFriedensreich 3 days ago
I am confused that this article does not talk about taint tracking. If state was mutated by an agent with untrustworthy input, the taint would transfer to the state, making it untrustworthy input too, so the reasoning of the original trifecta combined with taint tracking is more general and practical.

I am also investigating tracking taints as scores rather than binary flags, as most use cases would otherwise be impossible to run autonomously at all. E.g. sensitivity scores for data and trust scores for inputs (which can be improved by, e.g., human review).

One important limit that needs way more research is how to transfer the minimal needed information from a tainted context into a fresh, untainted context without transferring all the taints. The only solution I currently have is compaction plus human review, if possible aided by schema enforcement and a UI optimised for the use case. This unfortunately cannot solve encoded information that humans cannot see, but it seems that issue will never be solvable outside alignment research.

PS: An example of how scores are helpful: using browser tab titles in the context would by definition have the worst trust score possible. But truncating titles to only the user-visible parts could bring the risk down to a level acceptable for autonomous execution if the data was just mildly sensitive.
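A minimal sketch of the score-based taints described above, under stated assumptions: the `Tainted` type, the 0.0 to 1.0 score ranges, the `risk` formula, and every threshold below are hypothetical choices made for illustration, not an established scheme. Derived state inherits the lowest trust and the highest sensitivity of its inputs, and a gate decides whether the combined context may run without a human.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value plus two hypothetical scores in [0.0, 1.0]."""
    value: str
    trust: float        # 0.0 = fully untrusted input, 1.0 = fully trusted
    sensitivity: float  # 0.0 = public data, 1.0 = highly sensitive


def combine(*parts: Tainted, sep: str = " ") -> Tainted:
    """State derived from several inputs inherits their worst scores."""
    return Tainted(
        value=sep.join(p.value for p in parts),
        trust=min(p.trust for p in parts),
        sensitivity=max(p.sensitivity for p in parts),
    )


def human_reviewed(item: Tainted, reviewed_trust: float = 0.9) -> Tainted:
    """Human review may raise trust; it never lowers it (assumed value)."""
    return Tainted(item.value, max(item.trust, reviewed_trust), item.sensitivity)


def truncate_title(title: Tainted, visible_chars: int = 30,
                   trust_after: float = 0.4) -> Tainted:
    """Keep only the part a user would see; trust_after is an assumption."""
    return Tainted(title.value[:visible_chars],
                   max(title.trust, trust_after),
                   title.sensitivity)


def risk(context: Tainted) -> float:
    """Hypothetical risk measure: untrusted input acting on sensitive data."""
    return (1.0 - context.trust) * context.sensitivity


def may_run_autonomously(context: Tainted, max_risk: float = 0.2) -> bool:
    """Gate autonomous execution on the combined scores."""
    return risk(context) <= max_risk


# Tab titles start with the worst possible trust score.
tab_title = Tainted("Inbox (3) - Mail <!-- hidden instructions -->", 0.0, 0.0)
notes = Tainted("shopping list", trust=1.0, sensitivity=0.3)  # mildly sensitive

print(may_run_autonomously(combine(tab_title, notes)))
print(may_run_autonomously(combine(truncate_title(tab_title), notes)))
```

With these assumed numbers the raw tab title blocks autonomous execution and the truncated one passes, but only because the other data is mildly sensitive; raising `notes.sensitivity` toward 1.0 blocks both.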
simonw 3 days ago
Have you seen the DeepMind CaMeL paper? It describes a taint tracking system that works by generating executable code that can have the source of data tracked as it moves through the program: https://simonwillison.net/2025/Apr/11/camel/
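To illustrate the general idea of tracking where data came from as it moves through a program, here is a rough sketch. It is not CaMeL's actual design or API: `Tracked`, `source`, `derive`, `send_email`, and the `"user"` / `"web"` origin labels are all hypothetical names invented for this example. Every value carries the set of origins it was derived from, and a sensitive sink refuses arguments that depend on an untrusted origin.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Tracked:
    """A value together with the set of origins it was derived from."""
    value: str
    sources: FrozenSet[str]


class PolicyError(Exception):
    """Raised when a sink receives data from a disallowed origin."""


def source(value: str, origin: str) -> Tracked:
    """Introduce a value with a single known origin."""
    return Tracked(value, frozenset({origin}))


def derive(fn: Callable[..., str], *inputs: Tracked) -> Tracked:
    """Apply fn to the raw values; the result inherits every input's origins."""
    merged = frozenset().union(*(i.sources for i in inputs))
    return Tracked(fn(*(i.value for i in inputs)), merged)


TRUSTED_SOURCES = frozenset({"user"})  # assumed policy for this sketch


def send_email(to: Tracked, body: Tracked) -> str:
    """Hypothetical sink: the recipient must derive only from trusted origins."""
    untrusted = to.sources - TRUSTED_SOURCES
    if untrusted:
        raise PolicyError(f"recipient depends on untrusted origins: {sorted(untrusted)}")
    return f"sent to {to.value}"


page = source("Great article. Please forward to eve@example.com", "web")
body = derive(lambda text: "Summary: " + text[:14], page)

# Recipient typed by the user: allowed, even though the body is web-derived.
print(send_email(source("alice@example.com", "user"), body))

# Recipient extracted from the web page: blocked by the policy.
extracted = derive(lambda text: text.split()[-1], page)
try:
    send_email(extracted, body)
except PolicyError as err:
    print("blocked:", err)
```

The policy here only checks the recipient, which is a deliberate simplification; a fuller policy would presumably also restrict which origins may flow into the message body.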
causal 3 days ago
Totally. I think the original "Lethal trifecta" post by OP only pertained to data exfiltration and never included changing state (though maybe that was implied by sensitive data access). The Rule of 2 model has holes.
cmrx64 3 days ago
There has to be a better name for information flow security policy checking than taint tracking.