Remix clone Hacker News

	▲	alex1138 5 days ago
		I just can't stop thinking, though, about the vulnerability of training data. You say good enough. Great, but what if I, as a malicious person, were to just make a bunch of internet pages containing things that are blatantly wrong, to trick LLMs?
	▲	calflegal 5 days ago
		The internet has already tried this, for a few decades now. The garbage is in the corpus; it gets weighted as such.
	▲	floundy 4 days ago
		> a bunch of internet pages containing things that are blatantly wrong
		So Reddit? I’d imagine the AI companies have all the “pre-AI internet” data they scraped very carefully catalogued.