stavros 2 hours ago

You're not doing yourself a favor when you point out "but they can't do arithmetic!" as if anyone says otherwise. Yes, we all know they can't do arithmetic, and that's just how they work.

I feel like I'm saying "this hammer is so cool, it's made driving nails a breeze" and people go "but it can't screw screws in! Why won't anyone talk about that! Hammers really aren't all they're cracked up to be".

datsci_est_2015 an hour ago | parent | next [-]

Maybe because society has invested $trillions into this hammer and influencers are trying to convince CEOs to fire everyone and buy a bunch of hammers instead.

My comment even said “LLMs have utility”. I gave an inch, and now the mile must be taken.

stavros an hour ago | parent [-]

Saying that the fundamental limitations are things like counting the number of "r"s in "strawberry" is boring, though. That's just an artifact of how tokenization works, and it's trivial to work around.
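A minimal sketch of the workaround being alluded to (the prompt wording and the stand-in are illustrative, not any particular API): because models see multi-character tokens rather than letters, spelling the word out character by character in the prompt sidesteps the issue, and for pure counting you don't need a model at all.

```python
word = "strawberry"

# An LLM typically sees tokens like ["str", "aw", "berry"], not
# individual letters, so letter-counting questions trip it up.
# Spelling the word out forces each letter into its own token:
spelled = " ".join(word)  # "s t r a w b e r r y"
prompt = f"How many times does the letter r appear in: {spelled}?"

# Or skip the model entirely for counting tasks:
count = word.count("r")
print(count)  # 3
```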

Talking about how they find it hard to say they aren't sure of something is a much more interesting limitation to talk about, for example.

datsci_est_2015 an hour ago | parent [-]

> Talking about how they find it hard to say they aren't sure of something is a much more interesting limitation to talk about, for example.

Sure, thank you for steelmanning my argument. I didn’t think I needed to actually spell out all of the fundamental limitations of LLMs in this specific thread. They are spoken at length across the web, but are often met with pushback, which was my entire point.

Here’s another one: LLMs do not have a memory property. Shut off the power and turn it back on and you lose all context. Any “memory” feature implemented by companies that sell LLM wrappers is a hack on top of how LLMs work, like seeding a context window before letting the user interact with the LLM.
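A sketch of what that hack looks like, with `call_model` as a hypothetical stand-in for any LLM API: the model call itself is stateless, and the "memory" is just the wrapper replaying the whole transcript on every turn.

```python
def call_model(prompt: str) -> str:
    """Hypothetical stateless LLM call; placeholder reply."""
    return f"(reply to: {prompt!r})"

history: list[str] = []

def chat(user_message: str) -> str:
    history.append(f"User: {user_message}")
    # The entire transcript is re-sent every turn -- the model
    # itself retains nothing between calls.
    reply = call_model("\n".join(history))
    history.append(f"Assistant: {reply}")
    return reply
```

Drop the `history` list (i.e. cut the power) and all context is gone, which is the point being made above.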

stavros an hour ago | parent [-]

But that's also like saying "humans don't have a memory property, any 'memory' is in the hippocampus". It's not useful to say that "an LLM you don't bother to keep training has no memory". Of course it doesn't, you removed its ability to form new memories!

datsci_est_2015 3 minutes ago | parent [-]

So why, then, do we stop training LLMs and freeze them at a specific state? Is it perhaps because the results become terrible otherwise, and LLMs have a delicate optimal state for general use? That sounds like an even worse case for a model of intelligence.

TheSpiceIsLife 2 hours ago | parent | prev [-]

Because no one owns a $300 billion hammer that literally runs on fancy calculators.