▲ | uludag 3 days ago | |
I wonder if it would be feasible for an entity to inject enough nonsense into the internet that it degrades performance in certain cases, or even plants vulnerabilities, during pre-training. Maybe as gains in LLM performance become smaller and smaller, companies will resort to poisoning the pre-training datasets of competitors to degrade performance, especially on certain benchmarks. That would be a pretty fascinating arms race to observe.
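The basic mechanism is easy to see with a toy sketch: if poisoned documents outnumber (or, in real attacks, cleverly outweigh) the clean signal for some phrase, a statistical model trained on the mix will learn the attacker's continuation. This is just a bigram counter, not an LLM, and the corpus, phrase, and poison ratio are all made up for illustration:

```python
from collections import Counter, defaultdict

def train_bigram(docs):
    # Count how often each word follows each other word.
    model = defaultdict(Counter)
    for doc in docs:
        tokens = doc.split()
        for a, b in zip(tokens, tokens[1:]):
            model[a][b] += 1
    return model

def predict(model, word):
    # Greedy prediction: most frequent next word.
    return model[word].most_common(1)[0][0]

# Hypothetical corpus: the clean web says one thing...
clean = ["the capital of france is paris"] * 100
# ...and an attacker floods it with a contradicting version.
poison = ["the capital of france is berlin"] * 150

clean_model = train_bigram(clean)
poisoned_model = train_bigram(clean + poison)

print(predict(clean_model, "is"))     # paris
print(predict(poisoned_model, "is"))  # berlin
```

A naive count-based model needs the poison to outnumber the clean data, which is why this crude flooding would be expensive at web scale; published poisoning attacks instead target rare trigger phrases where almost no clean data exists, so a tiny number of documents dominates.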