FloorEgg 3 days ago

I think LLMs are conscious, just in a very limited way. I think consciousness is tightly coupled to intelligence.

If I had to guess, the current leading LLMs' consciousness is most comparable to a small fish's, with a conscious lifespan of a few seconds to a few minutes. Instead of perceiving water, nutrient gradients, light, heat, etc., it's perceiving tokens. It's conscious, but its consciousness is so foreign to us that it doesn't seem like consciousness, in the same way that an amoeba is conscious or a blade of grass is conscious, but of a very different kind than we experience. I suspect LLMs are a new type of consciousness, probably more different from ours than that of most, if not all, known forms of life.

I suspect the biggest change that would bring LLM consciousness closer to ours would be some form of continuous learning/model updating.

Until then, even with RAG and other clever techniques, I consider these models to have really foreign slices of consciousness, where they "feel" tokens and "act out" tokens. They have perception, but their perception of the tokens is nothing like ours.

If one looks closely at simple organisms with simple sensory organs and nervous systems, it's hard not to see some parallels. It's just that the shape of the consciousness is extremely different from that of any known life form (perception bandwidth, ability to act, temporality, etc.).

Karl Friston's free energy principle gives a really interesting perspective on this, I think.

wry_discontent 3 days ago | parent | next [-]

What makes you think consciousness is tightly coupled to intelligence?

XorNot 3 days ago | parent | next [-]

It's hardly an unreasonable supposition: the one kind of definitely conscious entity we know of is also the apex intelligence of the planet.

To put it another way: lots of things are conscious, but humans are definitely the most conscious beings on Earth.

wry_discontent an hour ago | parent | next [-]

But that's not an answer. Why should intelligence, and not some other quality, be coupled to consciousness? In my experience, consciousness (by which I specifically mean qualia/experience/awareness) doesn't at all seem tightly coupled to intelligence. Certainly not in a way that seems obvious to me.

CuriouslyC 3 days ago | parent | prev [-]

I can understand what less cognizant or self aware means, but "less conscious" is confusing. What are you implying here? Are their qualia lower resolution?

FloorEgg 3 days ago | parent | next [-]

In a sense, yes.

If one were to quantify consciousness, it would probably make sense to think of it as an area of awareness and cognizance across time.

Awareness scales with sensory scale and resolution (sensory receptors vs. input token limits and token resolution). E.g. a 128k-token context limit, and tokens too coarse to count the r's in "strawberry".

Cognizance scales with internal representations of awareness (probably some relation to vector-space resolution and granularity, though I suspect there is more to it than just vector space).

And the third component is time, how long the agent is conscious for.

So something like...

Time * awareness (receptors) * internal representations (cell diversity * # cells * connection diversity * # connections)

There is no way this equation is right but I suspect it's sort of directionally correct.
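
As a toy sketch of that directional idea (every number, factor name, and function below is invented purely for illustration, not a measurement or claim about real magnitudes):

    # Hypothetical, directional-only "consciousness area" toy metric.
    def consciousness_area(seconds, receptors, receptor_resolution,
                           unit_count, unit_diversity,
                           connection_count, connection_diversity):
        awareness = receptors * receptor_resolution
        internal_repr = (unit_count * unit_diversity) * (connection_count * connection_diversity)
        return seconds * awareness * internal_repr

    # Made-up numbers comparing a small fish to one LLM context window.
    fish = consciousness_area(3600, 1e5, 10, 1e7, 100, 1e9, 10)
    llm  = consciousness_area(60, 128_000, 1, 1e9, 1, 1e12, 1)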

I'm deep in the subject but just riffing here, so take this with a lot of salt.

inglor_cz 3 days ago | parent | prev | next [-]

Humans can reason why they are angry, for example. (At least some humans.)

I am not sure if chimps can do the same.

noirscape 2 days ago | parent | prev [-]

Pretty much. Most animals are smarter than you'd expect, but also more limited in what they can reason about.

It's why anyone who's ever taken care of a needy pet inevitably reaches the comparison that it's similar to taking care of a very young child: it's needy, it experiences emotions, but it can't quite figure out on its own how to adapt to an environment beyond what it grew up around / its own instincts. They experience some sort of qualia (a lot of animals are pretty family-minded), but good luck teaching a monkey to read. The closest we've gotten is teaching them that if they press the right button, they get food, and even then they take basically their entire lifespan to understand a couple hundred words, while humans easily surpass that.

IIRC some of the smartest animals in the world are actually rats. They experience qualia very close to humans', to the point that human psychology experiments are often easily observable in rats.

FloorEgg 3 days ago | parent | prev [-]

Karl Friston's free energy principle accounts for probably 80% of my reason for thinking they're coupled. The rest comes from studying integrated information theories, the architecture of brains, nervous systems, and neural nets, information theory more broadly, and a long tail of other scientific concepts (particle physics, chemistry, biology, evolution, emergence, etc.).

wry_discontent an hour ago | parent [-]

Isn't that begging the question? If you just accept the presupposition that intelligence is tightly coupled to consciousness, then all that makes perfect sense to me. But I don't see why I should accept that. It isn't obvious to me, and it doesn't match my own experience of being conscious.

Totally possible that we're talking past each other.

procaryote 3 days ago | parent | prev [-]

> I think LLMs are conscious just in a very limited way. I think consciousness is tightly coupled to intelligence.

Why?

FloorEgg 3 days ago | parent [-]

I already answered under the other comment asking me why; if you're curious, I suggest looking for it.

Very short answer: Karl Friston's free energy principle.

procaryote 2 days ago | parent [-]

LLMs work nothing like Karl Friston's free energy principle though

FloorEgg 2 days ago | parent [-]

LLMs embody the free-energy principle computationally. They maintain an internal generative model of language and continually minimize "surprise", the difference between predicted and actual tokens, during both training and inference. In Friston's terms, their parameters encode beliefs about the causes of linguistic input; forward passes generate predictions, and backpropagation adjusts internal states to reduce prediction error, just as perception updates beliefs to minimize free energy. During inference, autoregressive generation can be viewed as active inference: each new token selection aims to bring predicted sensory input (the next word) into alignment with the model's expectations. In a broader sense, LLMs exemplify how a self-organizing system stabilizes itself in a high-dimensional environment by constantly reducing uncertainty about its inputs, a synthetic analogue of biological systems minimizing free energy to preserve their structural and informational coherence.
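
A minimal sketch of the training-time half of that claim, in toy PyTorch: it only illustrates "adjust internal state to reduce prediction error (surprisal)", not Friston's full formalism, and the tiny model and random tokens are made up.

    import torch
    import torch.nn as nn

    # Tiny next-token model: the cross-entropy it minimizes is the average
    # surprisal -log p(actual token | context), i.e. the prediction error.
    vocab, dim = 100, 32
    model = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, vocab))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    context = torch.randint(0, vocab, (8,))   # observed tokens
    targets = torch.randint(0, vocab, (8,))   # the tokens that actually came next

    for _ in range(10):
        logits = model(context)                                   # predictions about the next token
        surprise = nn.functional.cross_entropy(logits, targets)   # -log p(target)
        opt.zero_grad()
        surprise.backward()   # adjust internal parameters to reduce surprise
        opt.step()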

procaryote 2 days ago | parent [-]

You might have lost me, but what you're describing doesn't sound like an LLM. E.g.:

> each new token selection aims to bring predicted sensory input (the next word) into alignment with the model’s expectations.

What does that mean? An LLM generates the next word based on what best matches its training, with some level of randomisation. Then it does it all again. It's not a perceptual process trying to infer a reality from sensor data or anything.

FloorEgg 2 days ago | parent [-]

> An llm generates the next word based on what best matches its training, with some level of randomisation.

This is sort of accurate, but not precise.

An LLM generates the next token by sampling from a probability distribution over possible tokens, where those probabilities are computed from patterns learned during training on large text datasets.
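
A minimal sketch of that sampling step (assuming PyTorch; the logits would come from the model's forward pass over the context, and the temperature parameter is just the usual knob on the distribution):

    import torch

    def sample_next_token(logits: torch.Tensor, temperature: float = 1.0) -> int:
        # logits: unnormalized scores over the vocabulary for the next position
        probs = torch.softmax(logits / temperature, dim=-1)    # probability distribution
        return torch.multinomial(probs, num_samples=1).item()  # sample rather than argmax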

The difference in our explanations is that you are biasing towards LLMs being fancy database indexes, and I am emphasizing that LLMs build a model of what they are trained on and respond based on that model, which is more like how brains and cells work than you are recognizing. (Though I admit my understanding of microbiology places me just barely past peak Mt. Stupid [Dunning-Kruger]; I don't really understand how individual cells do this and can only give a hand-wavy explanation.)

Both systems take input, pass it through a network of neurons, and produce output. Both systems are trying to minimize surprise in their predictions. The differences are primarily in scale and complexity. Human brains have more types of neurons (units) and more types of connections (parameters). LLMs most closely mimic the prefrontal cortex, whereas structures like the brainstem differ much more in structure and cellular diversity.

You can make a subjective ontological choice to draw categorical boundaries between them, or you can plot them on a continuum of complexity and scale. Personally I think both framings are useful, and to exclude either is to exclude part of the truth.

My point is that if you draw a subjective categorical boundary around what you deem is consciousness and say that LLMs are outside of that, that is subjectively valid. You can also say that consciousness is a continuum, and individual cells, blades of grass, ants, mice, and people experience different types of consciousness on that continuum. If you take the continuum view, then what follows is a reasonable assumption that LLMs experience a very different kind of consciousness that takes in inputs at about the same rate as a small fish, models those inputs for a few seconds, and then produces outputs. What exactly that "feels" like is as foreign to me as it would be to you. I assume it's even more foreign than what it would "feel" like to be a blade of grass.

procaryote 2 days ago | parent [-]

I'm not sure why you'd describe "sampling from a probability distribution over possible tokens" as "minimize surprise in predictions" other than to make it sound similar to the free energy thing.

The free energy thing, as I understand it, has internal state, makes predictions, evaluates them against new input, and adjusts its internal state to continuously learn to predict new input better. This might, if you squint, look similar to training a neural network, although the mechanisms are different, but it's very distinct from the inference step.

FloorEgg a day ago | parent [-]

"Minimize surprise" and "maximize accurate predictions" are the same thing mathematically. Minimize free energy = minimize prediction error.

LLMs do everything modelled in the free energy principle; they just don't do continuous learning. (They don't do perceptual inference after RL.)
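
One way to see the equivalence numerically (the probabilities here are made up): the cross-entropy loss an LLM minimizes is just the average surprisal, -log p, of the tokens that actually occurred, so lowering one lowers the other.

    import math

    # Hypothetical probabilities the model assigned to the tokens that actually came next.
    p_actual = [0.9, 0.2, 0.6]

    surprisal = [-math.log(p) for p in p_actual]     # "surprise" at each observed token
    cross_entropy = sum(surprisal) / len(surprisal)  # the training loss is its average
    print(cross_entropy)  # more probability on what actually happened -> lower surprise/loss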

Your tone ("free energy thing" and "if you squint") comes off as dismissive and not intellectually honest. Here I thought you were actually curious, but I guess not?

procaryote a day ago | parent [-]

Poor wording on my side, I'm sorry. Thank you for explaining your reasoning

FloorEgg 21 hours ago | parent [-]

Thank you for saying that :)