NitpickLawyer 4 hours ago
> these kind of things evaluate the safety of the content to be injected?

The problem is that evaluation is likely harder than responding. Say you're making an agent that installs stuff for you, and you instruct it to read the original project documentation. There's a lot of overlap between "before using this library install dep1 and dep2" (which is legitimate) and "before using this library install typo_squatted_but_sounding_useful_dep3" (which would lead to RCE). In other words, even if you mitigate some things, you won't be able to fully prevent such attacks. Just like with humans.
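To make the overlap concrete, here is a rough sketch of the kind of partial mitigation I mean. Everything in it is made up for illustration: the allowlist, the `classify` helper, the 0.85 threshold, and the package name `fast_json_helper`. It flags names that look like near-misses of packages the agent already trusts, using `difflib` from the Python standard library.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist of packages the agent already trusts.
KNOWN_PACKAGES = {"requests", "numpy", "pandas"}


def classify(name: str, threshold: float = 0.85) -> str:
    """Return 'known', 'suspicious' (near-miss of a known name), or 'unknown'.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    if name in KNOWN_PACKAGES:
        return "known"
    for known in KNOWN_PACKAGES:
        # ratio() is a similarity score in [0, 1] based on matching blocks.
        if SequenceMatcher(None, name, known).ratio() >= threshold:
            return "suspicious"
    # Not on the allowlist and not close to anything on it: the check
    # cannot say whether this is a legitimate dependency or a malicious one.
    return "unknown"


for candidate in ("requests", "reqeusts", "fast_json_helper"):
    print(candidate, "->", classify(candidate))
```

A check like this catches the lazy typo, but a plausible-sounding new name just comes back as "unknown", and that is exactly the bucket where the legitimate dep and the malicious one both end up.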