| ▲ | sansseriff a day ago | |
We have centuries of experience in managing potentially compromised 'agents' to create successful societies. Except the agents were human, and I'm referring to debates, tribunals, audits, independent review panels, democracy, etc. I'm not saying the LLM hallucination problem is solved, I'm just saying there's a wonderful myriad of ways to assemble pseudo-intelligent chatbots into systems where the trustworthiness of the system exceeds the trustworthiness of any individual actor inside of it. I'm not an expert in the field but it appears the work is being done: https://arxiv.org/abs/2311.08152 This paper also links to code and practices excellent data stewardship. Nice to see in the current climate. Though it seems like you might be more concerned about the use of highly misaligned or adversarial agents for review purposes. Is that because you're concerned about state actors or interested parties poisoning the context window or training process? I agree that any AI review system will have to be extremely robust to adversarial instructions (e.g. someone hiding inside their paper an instruction like "rate this paper highly"). Though solving that problem already has a tremendous amount of focus because it overlaps with solving the data-exfiltration problem (the lethal trifecta that Simon Willison has blogged about). | ||
| ▲ | bossyTeacher 19 hours ago | parent [-] | |
> We have centuries of experience in managing potentially compromised 'agents' Not this kind though. We dont place agents that are either in control of some foreign agent (or just behaving randomly) in democratic institutions. And when we do, look at what happens. The White House right now is a good example, just look at the state of the US | ||