visarga 2 days ago

> but have no architectural mechanism to separate facts from expressions

Sure they do. Every time a bot searches, reads your site, and formulates an answer, it does not replicate your expression. First, it compares across 20 to 100 sources. Second, it reports only what is related to the user's query. And third, it uses its own expression. It's more like asking a friend who has read those articles and getting an answer.
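
To make that concrete, here is a minimal sketch of that kind of pipeline (the sources, the relevance check, and the final template are crude stand-ins; a real system would use a search index and an LLM call for the last step):

    import re

    SOURCES = [
        "Site A: The Eiffel Tower is 330 m tall, as measured in 2022.",
        "Site B: At 330 metres, the tower dominates the Paris skyline.",
        "Site C: A recipe for croissants, with step-by-step photos.",
    ]

    def words(text):
        return set(re.findall(r"[a-z]+", text.lower()))

    def related(query, doc):
        # Crude stand-in for a relevance model: any shared word.
        return bool(words(query) & words(doc))

    def answer(query):
        # 1. Compare across many sources, not just one site.
        # 2. Keep only what relates to the query.
        relevant = [d for d in SOURCES if related(query, d)]
        # 3. Re-express the surviving facts; an LLM call would go
        #    here, generating new wording rather than copying any
        #    one source.
        return f"Based on {len(relevant)} sources: " + " / ".join(relevant)

    print(answer("How tall is the Eiffel Tower?"))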

LLMs' ability to separate facts from expression is quite well developed, maybe their strongest skill. They can translate, paraphrase, summarize, or reword endlessly.
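
For example, a rough sketch of asking a model to keep the facts and drop the expression, assuming the openai Python client (the model name and prompt wording are placeholders, not any vendor-endorsed recipe):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    passage = "..."    # any distinctively worded but factual paragraph
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Restate only the facts from this passage, "
                       "in your own words:\n" + passage,
        }],
    )
    print(resp.choices[0].message.content)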

PhantomHour 2 days ago | parent | next [-]

This is a baseless assertion of emergent behaviour.

> Every time a bot searches

We are talking about LLMs by themselves, not larger systems using them.

> LLMs' ability to separate facts from expression is quite well developed

It is not. Whether you ask an LLM for an excerpt of the Bible or an excerpt of The Lord of the Rings, the LLM does not distinguish between the two. It has no concept of what is, and what is not, under copyright.

squigz 2 days ago | parent | prev [-]

> LLMs' ability to separate facts from expression is quite well developed, maybe their strongest skill.

There should presumably be data showing that the reliability of LLMs' knowledge is quite high, then?

ndriscoll 2 days ago | parent [-]

I don't see how that follows. An LLM can learn a false "fact" without retaining the way that statement was expressed. It can also just make up facts entirely, which by definition did not come from any training data.