jsheard · 4 hours ago
It is known that the LAION dataset underpinning foundation models like Stable Diffusion contained at least a few thousand instances of real-life CSAM at one point. I think you would be hard-pressed to prove that any model trained on internet scrapes definitively wasn't trained on any CSAM whatsoever. https://www.theverge.com/2023/12/20/24009418/generative-ai-i...
defrost · 4 hours ago
> I think you would be hard-pressed to prove that any model trained on internet scrapes definitively wasn't trained on any CSAM whatsoever.

I'd be hard-pressed to prove that you definitely hadn't killed anybody, ever.

Legally, if it's asserted that these images are criminal because they are the product of a model trained on sources that contained CSAM, then the requirement would be to prove that assertion. With text and speech you could prompt the model to exactly reproduce a Sarah Silverman monologue and assert that this proves her content was used in the training set, etc. Here the defense would ask the prosecution to demonstrate how to extract a copy of the original CSAM.

But your point is well taken: it's likely that most image generation programs of this nature have been fed at least one image that was borderline jailbait, and likely at least one that was well below the line.
lazyasciiart · 4 hours ago
Then all image generation models should be considered inherently harmful, no?
Hizonner · 4 hours ago
I think you'd be hard-pressed to prove that a few thousand images (out of over 5 billion in the case of that particular dataset) had any meaningful effect on the final model's capabilities.