If an AI can fabricate a bunch of purported quotes due to being unable to access a page, why not assume that the exact same sort of AI can also accidentally misattribute hostile motivation or intent (such as gatekeeping or envy - and let's not pretend that butthurt humans don't do this all the time, see https://en.wikipedia.org/wiki/fundamental_attribution_error ) for an action such as rejecting a pull request? Why are we treating the former as a mere mistake, and the latter as a deliberate attack?

zahlman 7 hours ago

> Why are we treating the former as a mere mistake, and the latter as a deliberate attack?

"Deliberate" is a red herring. It would require AI to have volition, which I consider impossible, but that is also entirely beside the point. We also aren't treating the fabricated quotes as a "mere mistake". It's obviously quite serious that a computer system would respond this way and that a human in the loop would take it at face value. Someone is supposed to have accountability in all of this.

	zozbot234 7 hours ago
		I wrote 'treating' as a deliberate attack, which matches the description in the author's earlier blogpost. Acknowledging this doesn't require attaching human-like volition to AIs.

em-bee 7 hours ago

When it comes to AI, is there even a difference? It's an attack either way.

trollbridge 8 hours ago

This would be an interesting case of semantic leakage, if that’s what’s going on.