ben_w 2 days ago

> Having seen LLMs so many times produce coherent, sensible and valid chains of reasoning to diagnose issues and bugs in software I work on, I am at this point in absolutely no doubt that they are thinking.

While I'm not willing to rule *out* the idea that they're "thinking" (nor "conscious" etc.), the obvious counter-argument here is all the records we have of humans doing thinking, where the records themselves are not doing the thinking that went into creating those records.

And I'm saying this as someone whose cached response to "it's just matrix multiplication it can't think/be conscious/be intelligent" is that, so far as we can measure all of reality, everything in the universe including ourselves can be expressed as matrix multiplication.

Falsification, not verification. What would be measurably different if the null hypothesis was wrong?

chpatrick 2 days ago | parent [-]

I've definitely had AIs think through and produce good answers about specific things that have definitely not been asked before on the internet. I think the stochastic parrot argument is well and truly dead by now.

Earw0rm 2 days ago | parent | next [-]

I've also experienced this, to an extent, but on qualitative topics the goodness of an answer - beyond basic requirements like being parseable and then plausible - is difficult to evaluate.

They can certainly produce good-sounding answers, but as to the goodness of the advice they contain, YMMV.

chpatrick 2 days ago | parent [-]

I've certainly gotten useful and verifiable answers. If you're not sure about something, you can always ask it to justify its answer and then see if the arguments make sense.

hitarpetar 2 days ago | parent | prev [-]

how do you definitely know that?

stinos 2 days ago | parent | next [-]

Also, does it matter?

The point being made here is about the data LLMs have been trained with. Sure, that contains questions and answers, but obviously not all of it is in that form, just like an encyclopedia contains answers without the questions. So imo framing this as 'no one asked this before' is irrelevant.

More interesting: did OP get a sensible answer to a question about data which definitely was not in the training set? (And indeed, how was this 'definitely' established?) Not that a 'yes' would prove 'thinking', as opposed to calling it e.g. advanced autocompletion, but it's a much better starting point.

chpatrick 2 days ago | parent | prev [-]

Because I gave it a unique problem I had, and it came up with an answer it definitely didn't see in the training data.

Specifically, I wanted to know how I could interface two electronic components, one of which is niche, recent, handmade, and doesn't have any public documentation, so there's no way it could have known about it before.

stinos 2 days ago | parent [-]

> one of which is niche, recent, handmade and doesn't have any public documentation

I still see two possibilities: either you asked it something similar enough that it came up with a fairly standard answer which just happened to be correct, or you gave it enough info yourself.

- For example, you created a new line of MCUs called FrobnicatorV2 and asked it 'how do I connect power supply X to FrobnicatorV2', and it gave an answer like 'connect the red wire to VCC and the black to GND'. That's not exactly special.

- Or, you did describe that component in some way. And you did that using standard electronics lingo, so essentially in terms of other existing components which it definitely did know about (unless you invented something completely new that doesn't use any currently known physics). As such, it's irrelevant that your particular new component wasn't known, because you gave away the answer by describing it. E.g. you asked it 'how do I connect power supply X to an MCU with power pins Y and Z'. Again, nothing special.

chpatrick 2 days ago | parent [-]

If a human uses their general knowledge of electronics to answer a specific question they haven't seen before, that's obviously thinking. I don't see why LLMs are held to a different standard. It's obviously not repeating an existing answer verbatim, because no such answer exists in my case.

You're saying it's nothing "special" but we're not discussing whether it's special, but whether it can be considered thinking.

stinos a day ago | parent [-]

> it's obviously not repeating an existing answer verbatim

That the words aren't repeated verbatim doesn't make it thinking. Also, when we say 'humans think', that means a lot more than just 'new question generates correct answer' or 'smart autocompletion'. See a lot of the other comments here for details.

But again: I laid out two possibilities explaining why the question, and the data, might in fact not be new, so I'm curious which of the two (or some other explanation) applies to the situation you're talking about.

> You're saying it's nothing "special" but we're not discussing whether it's special, but whether it can be considered thinking.

Apologies, with 'special' I did in fact mean 'thinking'.

chpatrick a day ago | parent [-]

Sufficiently smart autocomplete is indistinguishable from thinking; I don't think that distinction means anything.