bdbdbdb 4 hours ago

I guess when it can't be tripped up by simple things like multiplying numbers, counting to 100 sequentially, or counting the letters in a string without writing a Python program, then I might believe it.

Also, no matter how many math problems it solves, it still gets lost in a codebase.

fenomas 3 hours ago | parent | next [-]

LLMs are bad at arithmetic and counting by design. It's an intentional tradeoff that makes them better at language and reasoning tasks.

If anybody really wanted a model that could multiply and count letters in words, they could just train one with a tokenizer and training data suited to those tasks. That model would be able to count letters, but it would be bad at things like translation and programming - the stuff people actually use LLMs for. So everyone trains with tokenizers and data suited to language tasks instead, and hence LLMs are good at language and bad at arithmetic.
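You can see the mismatch directly. A minimal sketch using the tiktoken library (the exact token boundaries shown are an assumption; they vary by tokenizer):

    import tiktoken

    # BPE tokenizers hand the model multi-character chunks,
    # so it never "sees" individual letters at all.
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("strawberry")
    print([enc.decode([t]) for t in tokens])
    # e.g. ['str', 'aw', 'berry'] - counting the r's means reasoning
    # across chunk boundaries, not reading characters off a string.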

anal_reactor 4 hours ago | parent | prev | next [-]

Arguments like "but AI cannot reliably multiply numbers" fundamentally misunderstand how AI works. AI can't do basic math not because it's stupid, but because digit-by-digit arithmetic happens to be an inherently hard task for an otherwise smart system. Lots of human adults can do complex abstract thinking, but ask them to count and it's "one... two... three... five... wait, I got lost".

datsci_est_2015 4 hours ago | parent [-]

> fundamentally misunderstand how AI works

Who does fundamentally understand how LLMs work? There are many claims flying around these days, all backed by some of the largest investments humans have ever collectively made. That's a lot of money to lose over fundamental misunderstandings.

Personally, I find that AI influencers conveniently brush aside any evidence about how LLMs fundamentally work (like their inability to perform basic arithmetic), insisting it be ignored in favor of results like TFA's.

Do LLMs have utility? Undoubtedly. But it's a giant red flag for me that their fundamental limitations, of which there are many, are treated as verboten topics.

stavros 3 hours ago | parent [-]

You're not doing yourself a favor when you point out "but they can't do arithmetic!" as if anyone says otherwise. Yes, we all know they can't do arithmetic, and that's just how they work.

I feel like I'm saying "this hammer is so cool, it's made driving nails a breeze" and people go "but it can't screw screws in! Why won't anyone talk about that! Hammers really aren't all they're cracked up to be".

datsci_est_2015 3 hours ago | parent | next [-]

Maybe because society has invested $trillions into this hammer and influencers are trying to convince CEOs to fire everyone and buy a bunch of hammers instead.

My comment even said “LLMs have utility”. I gave an inch, and now the mile must be taken.

stavros 3 hours ago | parent [-]

Saying that the fundamental limitations are things like counting the number of rs in strawberry is boring, though. That's how tokens work and it's trivial to work around.
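"Trivial to work around" here just means tool use: the model writes and runs a line of code instead of counting tokens it can't see. A rough sketch:

    # What the model emits instead of counting in its head -
    # the string type sees characters even if the model doesn't.
    print("strawberry".count("r"))  # 3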

How hard they find it to say they aren't sure of something, for example, is a much more interesting limitation to talk about.

datsci_est_2015 2 hours ago | parent [-]

> How hard they find it to say they aren't sure of something, for example, is a much more interesting limitation to talk about.

Sure, thank you for steelmanning my argument. I didn’t think I needed to actually spell out all of the fundamental limitations of LLMs in this specific thread. They are spoken at length across the web, but are often met with pushback, which was my entire point.

Here’s another one: LLMs do not have a memory property. Shut off the power and turn it back on and you lose all context. Any “memory” feature implemented by companies that sell LLM wrappers is a hack on top of how LLMs work, like seeding a context window before letting the user interact with the LLM.
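A hand-wavy sketch of that hack (names here are made up, not any vendor's actual API): the wrapper replays stored notes plus the entire conversation on every call, because the model itself retains nothing between calls:

    # Hypothetical wrapper: the model is stateless, so "memory"
    # is just text we prepend to every request ourselves.
    saved_notes = ["user's name is Ada", "prefers metric units"]
    history = []  # full conversation, resent on every single call

    def ask(user_msg, llm_call):
        history.append({"role": "user", "content": user_msg})
        messages = [{"role": "system",
                     "content": "Known facts: " + "; ".join(saved_notes)}]
        messages += history
        reply = llm_call(messages)  # the model sees only this window
        history.append({"role": "assistant", "content": reply})
        return reply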

stavros 2 hours ago | parent [-]

But that's also like saying "humans don't have a memory property, any 'memory' is in the hippocampus". It's not useful to say that "an LLM you don't bother to keep training has no memory". Of course it doesn't; you removed its ability to form new memories!

datsci_est_2015 an hour ago | parent [-]

So why then do we stop training LLMs and keep them stored at a specific state? Is it perhaps because the results become terrible and LLMs have a delicate optimal state for general use? This sounds like an even worse case for a model of intelligence.

stavros an hour ago | parent [-]

Nope, it's not that, but it's nice of you to offer a straw man. Makes the argument flow better.

datsci_est_2015 an hour ago | parent [-]

Not entirely a straw man. What is the purpose of storing and retrieving LLMs at a fixed state if not to guarantee a specific performance? Wouldn’t a strong model of intelligence be capable of, to extend your analogy, running without having its hippocampus lobotomized?

Given the precariousness of managing LLM context windows, I don’t think it’s particularly unfair to assume that LLMs that learn without limit become very unstable.

To steelman, if it’s possible, it may be prohibitively expensive. But somehow I doubt it’s possible.

stavros an hour ago | parent [-]

It is, indeed, prohibitively expensive. But it's not impossible. The proof is in the fact that you can fine-tune LLMs.
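A bare-bones sketch of what fine-tuning means mechanically, assuming Hugging Face transformers with GPT-2 as a stand-in (any causal LM works the same way): a gradient step on new text changes the weights themselves, which is, in a crude sense, forming a memory:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Stand-in model; the mechanics are identical for any causal LM.
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tok = AutoTokenizer.from_pretrained("gpt2")
    opt = torch.optim.AdamW(model.parameters(), lr=5e-5)

    # One gradient step on a new sentence: the weights update,
    # and the update survives a power cycle once saved to disk.
    batch = tok("The user's name is Ada.", return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    opt.step()
    model.save_pretrained("model-that-remembers")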

TheSpiceIsLife 3 hours ago | parent | prev [-]

Because no one owns a $300 billion hammer that literally runs on fancy calculators.
