xarope 2 days ago

I can see how the AI companies would work around this though:

A user queries the LLM's "static" training data; the LLM guesses an answer, then searches the internet in real time for data to support that guess. This would be classified as "browsing" rather than trawling.

(The searched data then gets added back into the corpus, sadly sidestepping all the anti-AI-trawling mechanisms.)

Kind of like the way a normal user would.
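A minimal sketch of that "guess, then browse, then ingest" loop. Every name here (`static_guess`, `live_search`, `corpus`) is a hypothetical stand-in, not any real API:

```python
# Hypothetical sketch of the workaround described above.
# All functions are illustrative stubs, not real model or search APIs.

corpus = []  # the training corpus the searched data gets fed back into


def static_guess(query):
    # Stand-in for the LLM answering from its frozen training data.
    return f"guess for: {query}"


def live_search(guess):
    # Stand-in for a real-time web search. From the site's point of view
    # this looks like one user "browsing", not bulk trawling.
    return [f"page supporting '{guess}'"]


def answer(query):
    guess = static_guess(query)
    evidence = live_search(guess)  # "browsing", one query at a time
    corpus.extend(evidence)        # searched data added back into the corpus
    return guess, evidence
```

The anti-trawling defenses never see a crawler here, just single, user-shaped requests, which is exactly why they'd be sidestepped.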

The problem is, as others have already mentioned: how would the LLM know a good answer from a bad one, when a "normal" user has the same issue?