| ▲ | alex1138 5 days ago | |
I just can't stop thinking, though, about the vulnerability of training data. You say good enough. Great, but what if I, as a malicious person, were to just make a bunch of internet pages containing things that are blatantly wrong, to trick LLMs? | ||
| ▲ | calflegal 5 days ago | |
The internet has already tried this, for a few decades now. The garbage is in the corpus; it gets weighted as such. | ||
| ▲ | floundy 4 days ago | |
>"a bunch of internet pages containing things that are blatantly wrong" So Reddit? I’d imagine the AI companies have all the “pre-AI internet” data they scraped very carefully catalogued. | ||