notepad0x90 3 days ago

I don't get why you would say that. It's just auto-completing. It cannot reason. It won't solve an original problem for which it has no prior context to "complete" an approximated solution with. You can give it more context and more data, but you're just helping it complete better. It does not derive an original state machine or algorithm to solve problems for which there are no obvious solutions; it instead approximates a guess (hallucination).

Consciousness and self-awareness are a distraction.

Consider that, for the same task and instructions, small variations in wording or spelling change its output significantly. If it thought and reasoned, it would know to ignore those and focus on the variables and input at hand to produce deterministic and consistent output. However, it only computes in terms of tokens, so when a token changes, the probability of what a correct response would look like changes, and it adapts.

It does not actually add 1+2 when you ask it to do so. It does not distinguish 1 from 2 as discrete units in an addition operation; it uses descriptions of the operation to approximate a result. And even for something so simple, some phrasings and wordings might not produce 3.
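
To make the token-sensitivity point concrete, here is a rough illustration (it assumes the tiktoken package and its cl100k_base encoding; the exact splits are only an example, but any rewording changes the token sequence):

  # rough illustration: near-identical phrasings tokenize differently, so the
  # model conditions on different input even though a human would treat them
  # as the same question (assumes `pip install tiktoken`)
  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")

  for prompt in ["What is 1+2?", "what is 1 + 2 ?", "Whats 1+2"]:
      print(prompt, "->", enc.encode(prompt))

  # the token sequences differ, so the distribution the model samples from
  # differs too, even though the "variables" (1, 2, addition) are the same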

slightwinder 2 days ago | parent | next [-]

> It won't solve an original problem for which it has no prior context to "complete" an approximated solution with.

Neither can humans. We also just brute force "autocompletion" with our learned knowledge and combine it into new parts, which we then add to our learned knowledge to deepen the process. We are just much, much better at this than AI, after some decades of training.

And I'm not saying that AI is fully there yet and has solved "thinking". IMHO it's more "pre-thinking" or proto-intelligence. The dots are there, but they are not yet merging to form the real picture.

> It does not actually add 1+2 when you ask it to do so. it does not distinguish 1 from 2 as discrete units in an addition operation.

Neither can a toddler nor an animal. The level of ability is irrelevant for evaluating its foundation.

cpt_sobel 2 days ago | parent | next [-]

> Neither can humans. We also just brute force "autocompletion"

I have to disagree here. When you are tasked with dividing two big numbers, you most certainly don't "autocomplete" (in the sense of finding the most probable next tokens, which is what an LLM does); rather, you go through a set of steps you have learned. Same as with the strawberry example: you're not throwing guesses until something statistically likely to be correct sticks.

slightwinder 2 days ago | parent | next [-]

Humans first recognize the problem, then search through their list of abilities to find the best skill for solving it, "autocompleting" their inner shell's command line (to stay with that picture) before they start executing. Common AIs today are not much different, especially with reasoning modes.

> you're not throwing guesses until something statistically likely to be correct sticks.

What do you mean? That's exactly how many humans operate in unknown situations/topics. If you don't know, just throw punches and see what works. Of course, not everyone is ignorant enough to be vocal about this in every situation.

empath75 2 days ago | parent | prev [-]

> I have to disagree here. When you are tasked with dividing two big numbers, you most certainly don't "autocomplete" (in the sense of finding the most probable next tokens, which is what an LLM does); rather, you go through a set of steps you have learned.

Why do you think that this is the part that requires intelligence, rather than a more intuitive process? We have had machines that can do this mechanically for well over a hundred years.

There is a whole category of critiques of AI of this type: "Humans don't think this way, they mechanically follow an algorithm/logic", but computers have been able to mechanically follow algorithms and perform logic from the beginning! That isn't thinking!

cpt_sobel a day ago | parent [-]

Good points - mechanically just following algorithms isn't thinking, and neither is "predicting the next tokens".

But would a combination of the two then be close to what we define as thinking?

notepad0x90 2 days ago | parent | prev | next [-]

Humans, and even animals, track different "variables" or "entities" as distinct things with meaning and logical properties, and then apply some logical system to those properties to compute various outputs. LLMs see everything as one thing: in the case of chat-completion models, they're completing text; in the case of image generation, they're completing an image.

Look at it this way: two students get 100% on an exam. One learned which multiple-choice options are most likely to be correct based on how the question is worded; they have no understanding of the topics at hand, and they're not performing any sort of topic-specific reasoning. They're just good at guessing the right option. The second student actually understood the topics, reasoned, and calculated, and that's how they aced the exam.

I recently read about a 3-4 year old who impressed their teacher by reading a storybook perfectly, like an adult. It turns out their parent had read it to them so much that they could predict, based on page turns and timing, the exact words that needed to be spoken. The child didn't know what an alphabet or a word was; they just got that good at predicting the next sequence.

That's the difference here.

slightwinder 2 days ago | parent [-]

I'd say they are all doing the same thing, just in different domains and at different levels of quality. "Understanding the topic" only means they have specialized, deeper contextualized information. But in the end, that student also just autocompletes their memorized data, with the exception that some of that knowledge might trigger a program they execute to insert the result into their completion.

The actual work is in gaining the knowledge and programs, not in accessing and executing them. How they operate, and on which data, variables, objects, worldview or whatever you call it, might make a difference in quality and building speed, but not in the process in general.

notepad0x90 2 days ago | parent [-]

> only means they have specialized, deeper contextualized information

No, LLMs can have that contextualized information. Understanding in a reasoning sense means classifying the thing and developing a deterministic algorithm to process it. If you don't have a deterministic algorithm to process it, it isn't understanding. LLMs learn to approximate; we do that too, but then we develop algorithms to process input and generate output using a predefined logical process.

A sorting algorithm is a good example when you compare it with an LLM sorting a list. Both may produce the correct outcome, but the sorting algorithm "understood" the logic, will follow that specific logic, and will have consistent performance.
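
To make the contrast concrete, a minimal sketch (llm_sort is a hypothetical stand-in for prompting a model, not a real API):

  # insertion sort applies the same explicit rule on every run; the LLM-based
  # "sort" samples a likely-looking answer that may or may not be correct
  def insertion_sort(xs: list[int]) -> list[int]:
      out = list(xs)
      for i in range(1, len(out)):
          key = out[i]
          j = i - 1
          while j >= 0 and out[j] > key:  # the rule is explicit and always applied
              out[j + 1] = out[j]
              j -= 1
          out[j + 1] = key
      return out

  def llm_sort(xs: list[int]) -> list[int]:
      # hypothetical stand-in: prompt a model and parse whatever it returns
      raise NotImplementedError

  print(insertion_sort([3, 1, 2]))  # always [1, 2, 3], by construction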

slightwinder 2 days ago | parent [-]

> understanding in a reasoning sense means classifying the thing and developing a deterministic algorithm to process it.

That's the learning part I was talking about, which is mainly supported by humans at the moment, which is why I called it proto-intelligence.

> If you don't have a deterministic algorithm to process it, it isn't understanding.

Commercial AIs like ChatGPT do have the ability to call programs and integrate the result in their processing. Those AIs are not really just LLMs. The results are still rough and poor, but the concept is there and growing.

notepad0x90 2 days ago | parent [-]

> That's the learning part I was talking about, which is mainly supported by humans at the moment, which is why I called it proto-intelligence.

Maybe it's just semantics, but I don't think LLMs even come close to a fruit fly's intelligence. Why can't we recognize and accept them for what they are: really powerful classifiers of data?

> Commercial AIs like ChatGPT do have the ability to call programs and integrate the result in their processing. Those AIs are not really just LLMs. The results are still rough and poor, but the concept is there and growing.

Yeah RAG and all of that, but those programs use deterministic algorithms. Now, if LLMs generated programs they call on as tools, that would be much more like the proto-intelligence you're talking about.

Semantics are boring, but it's important that we don't get complacent or celebrate early by calling it something it isn't.

staticman2 2 days ago | parent | prev | next [-]

>>> We also just brute force "autocompletion"

Wouldn't be an A.I. discussion without a bizarre, untrue claim that the human brain works identically.

Workaccount2 2 days ago | parent | next [-]

There are no true and untrue claims about how the brain works, because we have no idea how it works.

The reason people give that humans are not auto-complete is "Obviously I am not an autocomplete"

Meanwhile, people are just a black-box process that outputs words into their head, which they then take credit for and call cognition. We have no idea how that black box, the one that serves up a word when I say "think of a car brand", works.

ToucanLoucan 2 days ago | parent | next [-]

> because we have no idea how it works

Flagrantly, ridiculously untrue. We don't know the precise nuts and bolts regarding the emergence of consciousness and the ability to reason, that's fair, but different structures of the brain have been directly linked to different functions and have been observed in operation, with patients stimulated in various ways while machinery reads levels of neuro-activity in the brain, region by region. We know which parts handle our visual acuity and sense of hearing, and, even cooler, we can watch those same regions light up when we use our "mind's eye" to imagine things or engage in self-talk, the completely silent speech that nevertheless engages our verbal center, which is also engaged by the act of handwriting and typing.

In short: no, we don't have the WHOLE answer. But to say that we have no idea is categorically ridiculous.

As to the notion of LLMs doing similarly: no. They are trained on millions of texts of various sources of humans doing thinking aloud, and that is what you're seeing: a probabilistic read of millions if not billions of documents, written by humans, selected by the machine to "minimize error." And crucially, it can't minimize it 100%. Whatever philosophical points you'd like to raise about intelligence or thinking, I don't think we would ever be willing to call someone intelligent if they just made something up in response to your query, because they think you really want it to be real, even when it isn't. Which points to the overall charade: it wants to LOOK intelligent, while not BEING intelligent, because that's what the engineers who built it wanted it to do.

lkey 2 days ago | parent | prev | next [-]

Accepting as true "We don't know how the brain works in a precise way" does not mean that obviously untrue statements about the human brain cannot still be made. Your brain specifically, however, is an electric rat that pulls on levers of flesh while yearning for a taste of God's holiest cheddar. You might reply, "no! that cannot be!", but my statement isn't untrue, so it goes.

staticman2 2 days ago | parent | prev | next [-]

>>>There are no true and untrue claims about how the brain works, because we have no idea how it works.

Which is why if you pick up a neuroscience textbook it's 400 blank pages, correct?

There are different levels of understanding.

I don't need to know how a TV works to know there aren't little men and women acting out the TV shows when I put them on.

I don't need to know how the brain works in detail to know that claims that humans are doing the same things as LLMs are similarly silly.

solumunus 2 days ago | parent | next [-]

The trouble is that no one knows enough about how the brain works to refute that claim.

staticman2 2 days ago | parent [-]

There's no serious claim that needs refuting.

I don't think any serious person thinks LLMs work like the human brain.

People claiming this online aren't going around murdering their spouses like you'd delete an old LLama model from your hard drive.

I'm not sure why people keep posting these sorts of claims they can't possibly actually believe if we look at their demonstrable real life behavior.

solumunus 2 days ago | parent [-]

We’re obviously more advanced than an LLM, but to claim that human beings simply generate output based on inputs and context (environment, life experience) is not silly.

> People claiming this online aren't going around murdering their spouses like you'd delete an old LLama model from your hard drive.

Not sure what you’re trying to say here.

staticman2 2 days ago | parent [-]

I'm saying you'd object to being treated like an LLM and don't really have conviction when you make these claims.

I'd also say stringing together A.I. buzzwords (input output) to describe humans isn't really an argument so much as what philosophers call a category error.

solumunus 2 days ago | parent [-]

That I wouldn’t treat a human like an LLM is completely irrelevant to the topic.

Input and output are not AI buzzwords; they're fundamental terms in computation. The argument that human beings are computational has been alive in philosophy since the 1940s, brother…

naasking 2 days ago | parent | prev [-]

> I don't need to know how the brain works in detail to know that claims that humans are doing the same things as LLMs are similarly silly.

Yes you do. It's all computation in the end, and isomorphisms can often be surprising.

solumunus 2 days ago | parent | prev [-]

Our output is quite literally the sum of our hardware (genetics) and input (immediate environment and history). For anyone who agrees that free will is nonsense, the debate is already over: we're nothing more than output-generating biological machines.

slightwinder 2 days ago | parent | prev [-]

Similar, not identical. A bicycle and a car are both vehicles with tires, but they are still not identical.

hitarpetar 2 days ago | parent | prev | next [-]

> We also just brute force "autocompletion" with our learned knowledge and combine it to new parts, which we then add to our learned knowledge to deepen the process

You know this because you're a cognitive scientist, right? Or because this is the consensus in the field?

Psyladine 2 days ago | parent | prev [-]

>Neither can a toddler nor an animal. The level of ability is irrelevant for evaluating its foundation.

A foundation of rational, logical thought that can't process basic math? Even a toddler understands that 2 is more than 1.

ako 2 days ago | parent | prev | next [-]

An LLM by itself is not thinking, just remembering and autocompleting. But if you add a feedback loop where it can use tools, investigate external files or processes, and then autocomplete on the results, you get to see something that is (close to) thinking. I've seen Claude Code debug things by adding print statements in the source, reasoning on the output, and then determining next steps. This feedback loop is what sets AI tools apart: they can all use the same LLM, but the quality of the feedback loop makes the difference.
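
Roughly, the loop looks something like this (a sketch only; llm_complete and run_tool are hypothetical stand-ins, not any particular vendor's API):

  # sketch of the feedback loop: the model "autocompletes" the next action,
  # a tool actually runs it, and the output is fed back in as new context
  def llm_complete(transcript: list[dict]) -> dict:
      # hypothetical model call; returns either {"answer": "..."} or a tool
      # request such as {"tool": "shell", "args": "pytest -x"}
      raise NotImplementedError

  def run_tool(request: dict) -> str:
      # hypothetical: run the requested tool and capture its output
      raise NotImplementedError

  def agent_loop(task: str, max_steps: int = 10) -> str:
      transcript = [{"role": "user", "content": task}]
      for _ in range(max_steps):
          step = llm_complete(transcript)      # predict the next action
          if "answer" in step:
              return step["answer"]            # the model decided it is done
          observation = run_tool(step)         # run code, read files, add prints...
          transcript.append({"role": "tool", "content": observation})
      return "gave up after max_steps"

The quality difference between tools is mostly in what run_tool can do and how its observations get fed back.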

DebtDeflation 2 days ago | parent | next [-]

>But if you add a feedback loop where it can use tools, investigate external files or processes, and then autocomplete on the results, you get to see something that is (close to) thinking

It's still just information retrieval. You're just dividing it into internal information (the compressed representation of the training data) and external information (web search, API calls to systems, etc.). There is a lot of hidden knowledge embedded in language, and LLMs do a good job of teasing it out in a way that resembles reasoning/thinking but really isn't.

ako 2 days ago | parent | next [-]

No, it's more than information retrieval. The LLM is deciding what information needs to be retrieved to make progress and how to retrieve this information. It is making a plan and executing on it. Plan, Do, Check, Act. No human in the loop if it has the required tools and permissions.

naasking 2 days ago | parent | prev [-]

> LLMs do a good job of teasing it out in a way that resembles reasoning/thinking but really isn't.

Given the fact that "thinking" still hasn't been defined rigorously, I don't understand how people are so confident in claiming they don't think.

notepad0x90 2 days ago | parent [-]

Reasoning might be a better term to discuss, as it is more specific?

naasking 2 days ago | parent [-]

It too isn't rigorously defined. We're very much at the hand-waving "I know it when I see it" [1] stage for all of these terms.

[1] https://en.wikipedia.org/wiki/I_know_it_when_I_see_it

notepad0x90 18 hours ago | parent [-]

I can't speak for academic rigor, but it is very clear and specific from my understanding, at least. Reasoning, simply put, is the ability to come to a conclusion after analyzing information using a logic-derived deterministic algorithm.

naasking 17 hours ago | parent [-]

* Humans are not deterministic.

* Humans that make mistakes are still considered to be reasoning.

* Deterministic algorithms have limitations, like Goedel incompleteness, which humans seem able to overcome, so presumably, we expect reasoning to also be able to overcome such challenges.

notepad0x90 5 hours ago | parent [-]

1) I didn't say we were, but when someone is called reasonable or acting with reason, then that implies deterministic/algorithmic thinking. When we're not deterministic, we're not reasonable.

2) Yes, to reason does not imply being infallible. The deterministic algorithms we follow are usually flawed.

3) I can't speak much to that, but I speculate that if "AI" could do reasoning, it would be a much more complex construct that uses LLMs (among other things) as tools and variables, like we do.

assimpleaspossi 2 days ago | parent | prev | next [-]

>>you get to see something that is (close to) thinking.

Isn't that still "not thinking"?

ako 2 days ago | parent [-]

Depends on who you ask and what their definition of thinking is.

lossyalgo 2 days ago | parent | prev [-]

Just ask it how many r's are in strawberry and you will realize there isn't a lot of reasoning going on here; it's just trickery on top of token generators.

Workaccount2 2 days ago | parent | next [-]

This is akin to "Show a human an optical illusion that exploits their physiology".

LLMs be like "The dumb humans can't even see the dots" [1]

[1]https://compote.slate.com/images/bdbaa19e-2c8f-435e-95ca-a93...

lossyalgo 2 days ago | parent [-]

haha that's a great analogy!

How about non-determinism (i.e. hallucinations)? Ask a human ANY question 3 times and they will give you the same answer, every time, unless you prod them or rephrase the question. Sure the answer might be wrong 3 times, but at least you have consistency. Then again, maybe that's a disadvantage for humans!

adrianmonk 2 days ago | parent | prev [-]

Ask an illiterate person the same thing and they will fail badly too. Is it impossible to have intelligence without literacy? (Bonus: if so, how was writing invented?)

lossyalgo a day ago | parent [-]

Yes, but an illiterate person can be taught to read. Also, LLMs generally fail (non-deterministically) at math, whereas humans can be taught math.

lossyalgo 2 days ago | parent | prev | next [-]

Furthermore, regarding reasoning: just ask any LLM how many r's are in "strawberry", repeating maybe 3 times just to get a feeling for how much the answers vary. And this "quirk" of being unable to get the right answer is, after two years of people making fun of LLMs on various forums, still an issue. The models aren't getting smarter, and they definitely aren't thinking; they are still token generators with a few tricks on top to make them seem more intelligent than their predecessors.

EdNutting 2 days ago | parent | next [-]

Sorry to say, Claude Sonnet 4.5 (with and without extended thinking) has this one solved now. So, not “any” LLM, but maybe some still need work.

I presume they finally got around to training the internal reasoning loop to predict when it needs to work through the answer character by character.

— Transcripts from Claude Sonnet 4.5, initially with extended thinking:

> How many a’s in the phrase “artichokes and aardvarks”? Be sure to check your answer.

> Let me count the letter ‘a’ in “artichokes and aardvarks”:
> • artichokes: 1 ‘a’ (at the beginning)
> • and: 1 ‘a’ (at the beginning)
> • aardvarks: 3 ‘a’s (two at the beginning: “aa”, and one in the middle: “ardvarks”)
> Total: 1 + 1 + 3 = 5 ‘a’s
> Let me verify by marking them: artichokes and aardvarks
> Yes, there are 5 letter ‘a’s in the phrase “artichokes and aardvarks”.

> Are you sure that’s correct?

> Let me recount more carefully: “artichokes and aardvarks”
> Going through each word letter by letter:
> 1. artichokes - has 1 ‘a’ (first letter)
> 2. and - has 1 ‘a’ (first letter)
> 3. aardvarks - has 3 ‘a’s (first letter, second letter, and one more in the middle)
> Total: 1 + 1 + 3 = 5 ‘a’s
> Yes, I’m confident the answer is 5. Is there something specific you’d like me to recheck?

> How many r’s in strawberry?

> Let me count the letter ‘r’ in “strawberry”: s-t-r-a-w-b-e-r-r-y
> Going through each letter: • s: not an r • t: not an r • r: 1st r • a: not an r • w: not an r • b: not an r • e: not an r • r: 2nd r • r: 3rd r • y: not an r
> There are 3 r’s in “strawberry”.

<Without extended thinking> > How many u’s in universalulis?

> Looking at “universalulis” letter by letter: u-n-i-v-e-r-s-a-l-u-l-i-s There are 2 u’s in “universalulis”.

jamincan 2 days ago | parent | prev [-]

For what it's worth, when I ask ChatGPT 5, it gets the correct answer every time. The response varies, but the answer is always three.

ViewTrick1002 2 days ago | parent [-]

Now try a different language. My take: hard RL tuning to fix these "gotchas", since the underlying model can't do it on its own.

OpenAI is working on ChatGPT the application and ecosystem. They have transitioned from model building to software engineering, with RL tuning and integration of various services to solve the problems the model can't solve on its own. Make it feel smart rather than be smart.

This means that as soon as you find a problem where you step out of the guided experience, you get the raw model again, which fails when it encounters these "gotchas".

Edit: Here's an example where we see a heavily RL-tuned experience in English, where a whole load of context is added on how to solve the problem, while the Swedish prompt for the same word fails.

https://imgur.com/a/SlD84Ih

ACCount37 2 days ago | parent [-]

You can tell it "be careful about the tokenizer issues" in Swedish and see how that changes the behavior.

The only thing that this stupid test demonstrates is that LLM metacognitive skills are still lacking. Which shouldn't be a surprise to anyone. The only surprising thing is that they have metacognitive skills, despite the base model training doing very little to encourage their development.

lossyalgo a day ago | parent [-]

LLMs were not designed to count letters[0] since they work with tokens, so whatever trick they are now doing behind the scenes to handle this case can probably only handle this particular case. I wonder if it's now included in the system prompt. I asked ChatGPT and it said it's now using len(str) and some other Python scripts to do the counting, but who knows what's actually happening behind the scenes.

[0] https://arxiv.org/pdf/2502.16705
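
If it really is delegating to Python, the deterministic version is trivial; here's a sketch of what such a tool could look like (pure speculation about what actually runs behind the scenes):

  # what a deterministic counting tool would look like if the model hands the
  # problem to code instead of predicting the count from tokens
  def count_letter(word: str, letter: str) -> int:
      return sum(1 for ch in word.lower() if ch == letter.lower())

  print(count_letter("strawberry", "r"))  # 3, every time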

ACCount37 a day ago | parent [-]

There's no "trick behind the scenes" there. You can actually see the entire trick being performed right in front of you. You're just not paying attention.

That trick? The LLM has succeeded by spelling the entire word out letter by letter first.

It's much easier for an LLM to perform "tokenized word -> letters -> letter counts" than it is to perform "tokenized word -> letter counts" in one pass. But it doesn't know that! It copies human behavior from human text, and humans never had to deal with tokenizer issues in text!

You can either teach the LLM that explicitly, or just do RLVR on diverse tasks and hope it learns the tricks like this by itself.

IanCal 3 days ago | parent | prev | next [-]

> it's just auto-completing. It cannot reason

Auto completion just means predicting the next thing in a sequence. This does not preclude reasoning.

> I don't get why you would say that.

Because I see them solve real debugging problems all the time, talking through the impact of code changes or individual lines to find non-obvious errors with ordering and timing conditions in code they've never seen before.

notepad0x90 2 days ago | parent [-]

> This does not preclude reasoning.

It does not imply it either. To claim reasoning you need evidence. It needs to reliably NOT hallucinate results for simple conversations, for example (if it has basic reasoning).

> Because I see them solve real debugging problems all the time, talking through the impact of code changes or individual lines to find non-obvious errors with ordering and timing conditions in code they've never seen before.

Programming languages and how programs work are extensively and abundantly documented; solutions to problems and how to approach them have been documented on the internet extensively. It takes all of that data and completes the right text by taking the most likely correct pathway based on your input. It does not actually take your code and debug it. It is the sheer volume of data it uses and the computational resources behind it that make it hard to wrap your head around the difference between guessing and understanding. You too could look at enough Stack Overflow and (poorly) guess answers to questions without understanding anything about the topic, and if you guess enough you'll get some right. LLMs are just optimized to make the proportion of correct responses high.

IanCal 2 days ago | parent [-]

> It does not imply it either.

Right, it's irrelevant to the question of whether they can reason.

> to claim reasoning you need evidence

Frankly I have no idea what most people are talking about when they use the term and say these models can't do it. It seems to be a similarly hand-wavey exercise as when people talk about thinking or understanding.

> it needs to reliably NOT hallucinate results for simple conversations for example (if it has basic reasoning).

That's not something I commonly see in frontier models.

Again this doesn't seem related to reasoning. What we call hallucinations would be seen in something that could reason but had a fallible memory. I remember things incorrectly and I can reason.

> it does not actually take your code and debug it

It talks through the code (which it has not seen) and the process step by step; it can choose to add logging, run it, go through the logs, change what it thinks is happening, and repeat. It can do this until it explains what is happening, creates test cases to show the problem and what triggers it, fixes it, and shows the tests pass.

If that's not debugging the code I really don't know what to call it.

logicchains 2 days ago | parent | prev | next [-]

> I don't get why you would say that. It's just auto-completing. It cannot reason. It won't solve an original problem for which it has no prior context to "complete" an approximated solution with. You can give it more context and more data, but you're just helping it complete better. It does not derive an original state machine or algorithm to solve problems for which there are no obvious solutions; it instead approximates a guess (hallucination).

I bet you can't give an example of such a written problem that a human can easily solve but no LLM can.

xanderlewis 3 days ago | parent | prev | next [-]

> I don't get why you would say that.

Because it's hard to imagine the sheer volume of data it's been trained on.

utopiah 2 days ago | parent [-]

And because ALL the marketing AND UX around LLMs is precisely trying to imply that they are thinking. It's not just the challenge of grasping the ridiculous amount of resources poured in, which does include the training sets; it's that actual people are PAID to convince everybody those tools are actually thinking. The prompt is a chatbox, the "..." appears like in a chat with a human, the word "thinking" is used, the word "reasoning" is used, "hallucination" is used, etc.

All marketing.

xanderlewis a day ago | parent [-]

You're right. Unfortunately, it seems that not many are willing to admit this and be (rightly) impressed by how remarkably effective LLMs can be, at least for manipulating language.

madaxe_again 3 days ago | parent | prev | next [-]

The vast majority of human “thinking” is autocompletion.

Any thinking that happens with words is fundamentally no different to what LLMs do, and everything you say applies to human lexical reasoning.

One plus one equals two. Do you have a concept of one-ness, or two-ness, beyond symbolic assignment? Does a cashier possess number theory? Or are these just syntactical stochastic rules?

I think the problem here is the definition of “thinking”.

You can point to non-verbal models, like vision models - but again, these aren’t hugely different from how we parse non-lexical information.

gloosx 2 days ago | parent | next [-]

> Any thinking that happens with words is fundamentally no different from what LLMs do.

This is such a wildly simplified and naive claim. "Thinking with words" happens inside a brain, not inside a silicon circuit with artificial neurons bolted in place. The brain is plastic, it is never the same from one moment to the next. It does not require structured input, labeled data, or predefined objectives in order to learn "thinking with words." The brain performs continuous, unsupervised learning from chaotic sensory input to do what it does. Its complexity and efficiency are orders of magnitude beyond that of LLM inference. Current models barely scratch the surface of that level of complexity and efficiency.

> Do you have a concept of one-ness, or two-ness, beyond symbolic assignment?

Obviously we do. The human brain's idea of "one-ness" or "two-ness" is grounded in sensory experience — seeing one object, then two, and abstracting the difference. That grounding gives meaning to the symbol, something LLMs don't have.

gkbrk 2 days ago | parent | next [-]

LLMs are increasingly trained on images for multi-modal learning, so they too would have seen one object, then two.

gloosx 2 days ago | parent [-]

They never saw any kind of object; they only saw labeled groups of pixels – basic units of a digital image, representing a single point of color on a screen or in a digital file. An object is a material thing that can be seen and touched. Pixels are not objects.

gkbrk 2 days ago | parent | next [-]

Okay, the goalpost has instantly moved from seeing to "seeing and touching". Once you feed in touch-sensor data, where are you going to move the goalpost next?

Models see when photons hit camera sensors; you see when photons hit your retina. Both are some kind of sight.

gloosx a day ago | parent [-]

The difference between photons hitting the camera sensors and photons hitting the retina is immense. With a camera sensor, the process ends in data: voltages in an array of photodiodes get quantized into digital values. There is no subject to whom the image appears. The sensor records but it does not see.

When photons hit the retina, the same kind of photochemical transduction happens — but the signal does not stop at measurement. It flows through a living system that integrates it with memory, emotion, context, and self-awareness. The brain does not just register and store the light, it constructs an experience of seeing, a subjective phenomenon — qualia.

Once models start continuously learning from visual subjective experience, hit me up – and I'll tell you the models "see objects" now. Until a direct, raw photovoltaic information stream about the world around them, without any labelling, can actually make a model learn anything, they are not even close to "seeing".

madaxe_again 2 days ago | parent | prev [-]

My friend, you are blundering into metaphysics here - ceci n’est pas une pipe, the map is the territory, and all that.

We are no more in touch with physical reality than an LLM, unless you are in the habit of pressing your brain against things. Everything is interpreted through a symbolic map.

gloosx a day ago | parent [-]

When photons strike your retina, they are literally striking brain tissue that has been pushed outward to the front window of the skull. Eyes are literally the brain, so yes, we are pressing it against things to "see" them.

madaxe_again 2 days ago | parent | prev [-]

The instantiation of models in humans is not unsupervised, and language, for instance, absolutely requires labelled data and structured input. The predefined objective is “expand”.

See also: feral children.

gloosx a day ago | parent [-]

Children are not shown pairs like

"dog": [object of class Canine]

They infer meaning from noisy, ambiguous sensory streams. The labels are not explicit; they are discovered through correlation, context, and feedback.

So although caregivers sometimes point and name things, that is a tiny fraction of linguistic input, and it is inconsistent. Children generalize far beyond that.

Real linguistic input to a child is incomplete, fragmented, error-filled, and context-dependent. It is full of interruptions, mispronunciations, and slang. The brain extracts structure from that chaos. Calling that "structured input" confuses the output (the inherent structure of language) with the raw input: noisy speech and gestures.

The brain has drives: social bonding, curiosity, pattern-seeking. But it doesn't have a single optimisation target like "expand." Objectives are not hardcoded loss functions, they are emergent and changing.

You're right that lack of linguistic input prevents full language development, but that is not evidence of supervised learning. It just shows that exposure to any language stream is needed to trigger the innate capacity.

Both the complexity and the efficiency of human learning are just on another level; transformers are child's play in comparison. They are not going to gain consciousness, and no AGI will happen in the foreseeable future. It is all just marketing crap, and it's becoming more and more obvious as the dust settles.

notepad0x90 2 days ago | parent | prev [-]

We do a lot of autocompletion, and LLMs overlap with that for sure. I don't know about the "vast majority", though; even basic operations like making sure we're breathing or have the right hormones prompted are not guesses but deterministic, algorithmic ops. Things like object recognition and speech might qualify as autocompletion. But let's say you need to set up health monitoring for an application. That's not an autocomplete operation: you must evaluate various options, have opinions on them, weigh priorities, etc. In other words, we do autocompletion, but even then the autocompletion is a basic building block or tool we use in constructing more complex decision logic.

If you train an animal to type the right keys on a keyboard to produce a hello-world program, you didn't just teach it how to code. It just memorized the right keys that lead to its reward. A human programmer understands the components of the code, the intent and expectations behind it, and can reason about how changes would affect outcomes. The animal just knows how the reward can be obtained most reliably.

naasking 2 days ago | parent | prev | next [-]

> don't get why you would say that. it's just auto-completing.

https://en.wikipedia.org/wiki/Predictive_coding

> If it thought and reasoned, it would know to ignore those and focus on the variables and input at hand to produce deterministic and consistent output

You only do this because you were trained to do this, e.g. to see symmetries and translations.

Kichererbsen 3 days ago | parent | prev | next [-]

Sure. But neither do you. So are you really thinking or are you just autocompleting?

When was the last time you sat down and solved an original problem for which you had no prior context to "complete" an approximated solution with? When has that ever happened in human history? All the great invention-moment stories that come to mind seem to have exactly that going on in the background: prior context being auto-completed in a Eureka! moment.

notepad0x90 2 days ago | parent [-]

I think (hah) you're underestimating what goes on when living things (even small animals) think. We use auto-completion for some tasks, but it is just one component of what we do.

Let's say your visual system auto-completes some pattern and detects a snake while you're walking; that part is auto-completion. You will probably react by freezing or panicking; that part is not auto-completion, it is a deterministic algorithm. But then you process the detected object, auto-completing again to identify it as just a long cucumber. Again, the classification part is auto-completion. What will you do next? "Hmm, free cucumber, I can cook a meal with it," and you pick it up. Auto-completion is all over that simple decision, but you're using the results of auto-completion to derive an association (food), check your hunger level (not auto-completion), determine that the food is desirable and safe to eat (some auto-completion), evaluate what other options you have for food (evaluating auto-complete outputs), and then instruct your nervous system to pick it up.

In other words, we use auto-completion all the time as an input; we don't reason using auto-completion. You can argue that if all your input comes from auto-completion (it doesn't), then it makes no difference. But we have deterministic logical reasoning systems that evaluate auto-completion outputs. If your cucumber detection identified it as a rotten cucumber, then the decision that it is not safe to eat is not made by auto-completion but by reasoning logic that uses the auto-completion output. You can approximate the level of rot, but once you recognize it as rotten, you make a decision based on that information. You're not approximating a decision, you're evaluating simple logic: if(safe()){eat();}.
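
Roughly, the split I mean looks like this (a toy sketch; classify() is a hypothetical stand-in for the pattern-matching part):

  # the classifier output is a fuzzy guess; the decision on top of it is a
  # fixed rule applied the same way every time
  def classify(image) -> tuple[str, float]:
      # hypothetical stand-in: returns a label and a rot estimate in [0, 1]
      raise NotImplementedError

  def decide(image) -> str:
      label, rot = classify(image)           # auto-completion-like guess
      if label == "cucumber" and rot < 0.2:  # deterministic rule on the guess
          return "eat"
      return "discard"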

Now amp that up to solving very complex problems: what experiments will you run, what theories will you develop, what R&D is required for a solution, etc. These too are not auto-completions. An LLM would auto-complete these and might arrive at the same conclusion most of the time, but our brains are following algorithms we developed and learned over time, where an LLM is just expanding on auto-completion with a lot more data. Our brains are not trained on all the knowledge available on the public internet; we retain a minuscule fraction of that. We can arrive at conclusions similar to the LLM's because we are reasoning and following algorithms matured and perfected over time.

The big takeaway should be that, as powerful as LLMs are now, if they could reason like we do, they'd dominate us and become unstoppable, because their auto-completion is many magnitudes better than ours. If they could write new and original code based on an understanding of problem-solving algorithms, that would be gen AI.

We can not just add 1 + 1 but prove that the addition operation is correct mathematically, and understand that when you add one more object to a set, the addition operation always increments. We don't approximate that; we always, every single time, increment, because we are following an algorithm instead of choosing the most likely correct answer.

jiggawatts 2 days ago | parent | prev | next [-]

You wrote your comment one word at a time, with the next word depending on the previous words written.

You did not plan the entire thing, every word, ahead of time.

LLMs do the same thing, so... how is your intelligence any different?

ben_w 2 days ago | parent | next [-]

A long time ago I noticed that I sometimes already had a complete thought before my inner monologue turned it into words. A few times I tried skipping the inner monologue because I'd clearly already thought the thought. Turns out the bit of my brain that creates the inner monologue from the thought can generate a sense of annoyance that the rest of my brain can feel.

Not that it matters, there's evidence that while LLMs output one word at a time, they've got forward-planning going on, having an idea of the end of a sentence before they get there.

rcxdude 2 days ago | parent [-]

Indeed, and it seems like they would really struggle to output coherent text at all if there was not some kind of pre-planning involved (see how even humans struggle with it in games where you have to construct a sentence by having each person shout out one word at a time). Even GPT-2 likely had at least some kind of planning for the next few words in order to be as coherent as it was.

lossyalgo 2 days ago | parent | prev [-]

Tell that to German-speakers, where the verb comes last, and the order of things in sentences is not anything like English, therefore requiring you to think of the entire sentence before you just spit it out. Even the numbers are backwards (twenty-two is two-and-twenty) which requires thinking.

Furthermore, when you ask an LLM to count how many r's are in the word strawberry, it will give you a random answer, "think" about it, and give you another random answer. And I guarantee you that over 3 attempts, including reasoning, it will flip-flop between right and wrong, at random, unlike a human, who, when asked "how many r's are in the word strawberry", will be able to tell you the correct answer every. fucking. time.

edit: formatting

pka 2 days ago | parent | next [-]

It seems models are pre-planning though:

> How does Claude write rhyming poetry? Consider this ditty:

> He saw a carrot and had to grab it,

> His hunger was like a starving rabbit

> To write the second line, the model had to satisfy two constraints at the same time: the need to rhyme (with "grab it"), and the need to make sense (why did he grab the carrot?). Our guess was that Claude was writing word-by-word without much forethought until the end of the line, where it would make sure to pick a word that rhymes. We therefore expected to see a circuit with parallel paths, one for ensuring the final word made sense, and one for ensuring it rhymes.

> Instead, we found that Claude plans ahead. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.

[https://www.anthropic.com/research/tracing-thoughts-language...]

nxor 2 days ago | parent | prev [-]

The part about strawberry is just not right. That problem was solved. And I do think it's a stretch to say German speakers think of the entire sentence before speaking it.

lossyalgo 2 days ago | parent [-]

LLMs were not designed to count letters[0] since they work with tokens, so whatever trick they are now doing behind the scenes to handle this case can probably only handle this particular case. I wonder if it's now included in the system prompt. I asked ChatGPT and it said it's now using len(str) and some other Python scripts to do the counting, but who knows what's actually happening behind the scenes.

[0] https://arxiv.org/pdf/2502.16705
