▲ gchamonlive 5 days ago
> it's been proven that it doesn't summarize, but rather abridges and abbreviates data

Do you have more resources on that? I'd love to read about the methodology.

> And therefore it's impossible to test the accuracy if it's consuming your own data.

Isn't that only true if the result is hard to verify? If it's a result that's hard to produce but easy to verify, a class many problems fall into, you'd just need to look at the synthesized results.

If you ask it "given these arbitrary metrics, what is the best business plan for my company?", it'd be really hard to verify the result. It'd be hard to verify the result from anyone, for that matter, even specialists.

So I think it's less about expecting the LLM to do autonomous work and more about using LLMs to help you search the latent space for interesting correlations more efficiently, so that you, and not the LLM, come up with the insights.
▲ thoughtpeddler 4 days ago
Look into the emerging literature around "needle-in-a-haystack" tests of LLM context windows. You'll see part of what the poster you're replying to is describing. These tests can also be framed as asking "how lazy is my LLM being when it analyzes the input I've provided?" Hint: they can get quite lazy!

I agree with the poster you replied to that "RAG my Obsidian"-type experiments with local models are middling at best. I'm optimistic things will get a lot better, but it's hard to trust many of the 'insights' this blog post talks about without intense QA-ing (which I doubt the author did, considering their writing is mostly AI-assisted as well).
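For anyone unfamiliar, the basic shape of a needle-in-a-haystack test is easy to sketch: bury one distinctive fact at a chosen depth in a long filler document, ask the model to retrieve it, and sweep the depth to see where recall drops off. Here's a minimal harness in Python; the actual model call (`query_llm`) is a hypothetical placeholder, everything else runs as-is:

```python
# Needle-in-a-haystack harness sketch. `query_llm` is hypothetical;
# swap in whatever client your model provider exposes.

FILLER = "The quick brown fox jumps over the lazy dog."
NEEDLE = "The secret passphrase is 'mangrove-42'."

def build_haystack(needle: str, n_sentences: int, depth: float) -> str:
    """Bury `needle` at fractional `depth` (0.0 = start, 1.0 = end)
    inside `n_sentences` of repeated filler text."""
    sentences = [FILLER] * n_sentences
    sentences.insert(int(n_sentences * depth), needle)
    return " ".join(sentences)

def run_trial(depth: float) -> bool:
    haystack = build_haystack(NEEDLE, n_sentences=2000, depth=depth)
    prompt = f"{haystack}\n\nWhat is the secret passphrase?"
    # answer = query_llm(prompt)        # hypothetical LLM call
    # return "mangrove-42" in answer
    return "mangrove-42" in haystack    # sanity check only, no model

# Sweep needle depths; a "lazy" model often fails mid-document.
results = {d: run_trial(d) for d in (0.0, 0.25, 0.5, 0.75, 1.0)}
```

Real evaluations repeat this over many depths and context lengths and plot a recall heatmap; models that "abridge" rather than read tend to miss needles buried in the middle of the window.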
▲ bdbdbdb 5 days ago
> If you ask it "given these arbitrary metrics, what is the best business plan for my company?", it'd be really hard to verify the result. It'd be hard to verify the result from anyone, for that matter, even specialists.

Hard to verify something that subjective, for sure. But a specialist will be applying intelligence to the data; an LLM is just generating text strings that sound plausible.

The source for my claim about LLMs not summarizing but abbreviating is on HN somewhere; I'll dig it out.

Edit: sorry, I tried but couldn't find the source.