brap 7 hours ago
So you approve its access to the secret once. How can you be sure the secret wasn't sent someplace else, or persisted somehow for future sessions? Say you gave it access to Gmail for the sole purpose of emailing your mom. Are you sure the email it sent didn't contain a hidden pixel from totally-harmless-site.com/your-token-here.gif?
qup 5 hours ago
I don't have one yet, but I would only give it access to function calling for things like communication. Then I can surveil and route the messages at my own discretion. If I gave it access to email my mom (I actually did this with an assistant I built after ChatGPT launched), I would really be giving it access to a function I wrote that results in an email. The function can handle the data any way it pleases, for instance by stripping HTML.
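A rough sketch of what such a wrapper function could look like, using only the Python standard library. The names `strip_html`, `build_email_to_mom`, and `MOM_ADDRESS` are made up for illustration, and actually sending the message is left out; the point is only that the agent hands over text and never controls the final markup.

```python
from email.message import EmailMessage
from html.parser import HTMLParser

MOM_ADDRESS = "mom@example.com"  # hypothetical fixed recipient; the agent cannot change it


class _TextOnly(HTMLParser):
    """Collects text nodes and discards every tag, including <img>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text: str) -> str:
    """Return the text content of `text` with all HTML tags removed."""
    parser = _TextOnly()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


def build_email_to_mom(subject: str, body: str) -> EmailMessage:
    """The only 'email' capability exposed to the agent (assumed design)."""
    msg = EmailMessage()
    msg["To"] = MOM_ADDRESS
    # Collapse whitespace so the agent cannot smuggle line breaks into a header.
    msg["Subject"] = " ".join(strip_html(subject).split())
    # Plain-text body only: no markup survives, so no remote image can load.
    msg.set_content(strip_html(body))
    return msg
```

Stripping tags removes the auto-loading tracking pixel, but a URL written out as plain text would still pass through, so link filtering would need its own check.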
zozbot234 7 hours ago
Access to the secret, long-term persistence/reasoning, and posting should each be handled by a separate subagent, and all exchange of data among them should be monitored. But this is easy in principle, since the data is just a plain-text context.
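As a minimal sketch of the monitoring idea, assuming every message between subagents is a plain string that passes through one relay: the `Monitor` class and the subagent names below are hypothetical, and the check is a naive substring match, so an encoded or paraphrased secret would not be caught by it.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Relays plain-text messages between subagents and logs each one."""

    secrets: list[str]
    log: list[tuple[str, str, str]] = field(default_factory=list)

    def relay(self, sender: str, receiver: str, text: str) -> str:
        # Record every attempted exchange, including ones that get blocked.
        self.log.append((sender, receiver, text))
        for secret in self.secrets:
            if secret in text:
                raise PermissionError(
                    f"message from {sender} to {receiver} contains a secret"
                )
        return text
```

For example, `Monitor(secrets=["tok-123"]).relay("reasoner", "poster", "hello")` returns the text unchanged, while a message containing `tok-123` raises `PermissionError` before it reaches the posting subagent.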