johnnyanmac 3 days ago

It's not training on books, but it will answer questions about the book you're reading. Doesn't pass the sniff test.

>My device, my content

I don't think you own the Kindle store and servers used to train the AI.

terafo 3 days ago | parent | next [-]

There are LLMs that can process a 1-million-token context window. Amazon Nova 2, for one, even though it's definitely not the highest-quality model. You just put the whole book in context and have the LLM answer questions about it. And given that the domain is pretty limited, you can store the KV cache for the most popular books on SSD, eliminating quite a bit of cost.
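A minimal sketch of the approach described here. `build_prompt` is a hypothetical helper that just assembles the context; the actual model call (and any server-side KV-cache reuse for popular books) is not shown and would depend on the provider's API.

```python
# Sketch of the "whole book in the context window" approach: no training on
# the book, just inference over it. The prompt layout below is illustrative,
# not any particular provider's format.

def build_prompt(book_text: str, question: str) -> str:
    """Place the entire book in context, then append the reader's question."""
    return (
        "Answer questions using only the book below.\n\n"
        "--- BOOK START ---\n"
        f"{book_text}\n"
        "--- BOOK END ---\n\n"
        f"Question: {question}"
    )

prompt = build_prompt("Call me Ishmael. ...", "Who is the narrator?")
```

Because the prompt prefix (the book itself) is identical for every reader of the same title, its KV cache can be computed once and reused, which is the cost-saving point made above.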

DennisP 3 days ago | parent [-]

You could also fill the context with just the book portion that you've read. That'd be a sure-fire way to fulfill Amazon's "spoiler-free" promise.
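The spoiler-free variant can be sketched in a couple of lines. `last_read_char` is an assumed reading-position marker (e.g. derived from the device's furthest-read location); the real implementation would presumably track position by Kindle location or page rather than character offset.

```python
# Sketch of spoiler-free context trimming: only the portion of the book the
# reader has already seen is passed to the model, so it cannot leak later plot.

def spoiler_free_context(book_text: str, last_read_char: int) -> str:
    """Return only the text up to the reader's current position."""
    return book_text[:max(0, last_read_char)]
```

Since the model never sees text past the reader's position, it cannot spoil it, regardless of how the model is prompted.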

catgary 3 days ago | parent | prev | next [-]

Are you implying that an LLM needs to be trained on a specific piece of text to answer questions about it?

johnnyanmac 3 days ago | parent [-]

If you want proper answers, yes. If you want to rely on whatever reddit or tiktok says about the book, then I guess at that point you're fine with hallucinations and others doing the thinking for you anyway. Hence the issues brought up in the article.

I wouldn't trust an LLM for anything more than the most basic questions if it didn't actually have text to cite.

catgary 3 days ago | parent | next [-]

Luckily, the LLM has the text to cite: it can be passed in at inference time, which is legally distinct from training on the data.

terafo 3 days ago | parent | prev [-]

Having access to the text and being trained on the text are two different things.

3 days ago | parent | prev | next [-]
[deleted]
tshaddox 3 days ago | parent | prev [-]

> It's not training on books, but it will answer questions about the book you're reading. Doesn't pass the sniff test.

What do you mean? Presumably the implication is that it will essentially read the book (or search through it) in order to answer questions about it. An LLM can of course summarize text that's not in its training set.

johnnyanmac 3 days ago | parent [-]

"Reads the book" is the issue, yes. It's possible they aren't training. Vit to be frank, we're long past the BOTD where tech companies aren't going to attempt to traon on every little thing fed into their servers.

Happy to be proven wrong, though.