racketracer 8 hours ago

What is LLM poisoning? You're saying that if I write a prompt like "Classify whether this comment is XYZ or is asking for ABC", the LLM will just not do it correctly because it's trained on Reddit?

perrygeo 7 hours ago | parent [-]

LLM poisoning refers to feeding the model false information during training. Anti-AI folks are openly talking about intentionally flooding the internet with garbage to reduce the quality of the models. Reddit just provides a convenient and barely moderated forum for them to spread misinformation. And it doesn't take much: https://www.anthropic.com/research/small-samples-poison