ninetyninenine 2 days ago

I've been saying this to people. Tons of people don't realize that we have no idea how these things work. So questions like "Is it AGI?" or "Is it conscious?", where the questions themselves hinge on words with fuzzy, ill-defined meanings, are pointless.

The core of the matter is: We don't know!

So when someone says ChatGPT could be conscious, or when someone says it can't be conscious: we don't actually know! And we aren't even fully sure about the definition of consciousness.

My problem with HN is that a ton of people here take the stance that we know how LLMs work, that we know definitively they aren't conscious, and that the people who say otherwise are alarmist and stupid.

I think the fact of the matter is: if you're putting your foot down and saying LLMs aren't intelligent, you're wildly illogical and misinformed about the status quo of artificial intelligence, and a huge portion of the mob mentality on HN thinks this way. It's like there are these buzz phrases getting thrown around, saying the LLM is just a glorified autocomplete (which it is), and people latch on to these buzz phrases until their entire understanding of LLMs becomes basically a composition of buzz concepts like "transformers", "LLMs", and "chain of thought", when in actuality real critical thinking about what's going on in these networks tells you we don't UNDERSTAND anything.

Also, the characterization in the article is mistaken. It says we understand LLMs in a limited way. Yeah, sure. It's about as limited as our understanding of the human brain. You know scientists found a way to stimulate the reward centers of the brain with electrodes and were able to make a person feel the utmost pleasure? The whole Golden Gate Bridge thing is exactly that: you perturb something in the network and it causes a semi-predictable output. In the end we still generally don't get wtf is going on.

We literally understand nothing. What we do understand is so minuscule compared with what we don't understand that it's negligible.

sirwhinesalot 2 days ago

This is such a weird take to me. We know exactly how LLMs and neural networks in general work.

They're just highly scaled up versions of their smaller curve fitting cousins. For those we can even make pretty visualizations that show exactly what is happening as the network "learns".

I don't mean "we can see parts of the brain light up", I mean "we can see the cuts and bends each ReLU is doing to the input".
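
To make "cuts and bends" concrete, here's a toy sketch with hand-picked weights (purely illustrative, nothing learned and not any real model): a 1D input pushed through a few ReLU units is just a piecewise-linear curve, and you can read off exactly where each unit kinks it.

    def relu(z):
        return max(z, 0.0)

    def tiny_net(x, units):
        # units: one (w, b, v) triple per hidden ReLU, summed into one output
        return sum(v * relu(w * x + b) for (w, b, v) in units)

    units = [(1.0, 0.0, 1.0), (1.0, -1.0, -2.0), (1.0, -2.0, 1.0)]
    for (w, b, v) in units:
        print("bend at x =", -b / w)   # each ReLU kinks the curve where w*x + b = 0
    for i in range(-4, 13):
        x = i / 4
        print(x, tiny_net(x, units))   # a little piecewise-linear "hat" shape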

We built these things, we know exactly how they work. There's no surprise beyond just how good prediction accuracy gets with a big enough model.

Deep Neural Networks are also a very different architecture from what is going on in our brains (which work more like Spiking Neural Networks) and our brains don't do backpropagation, so you can't even make direct comparisons.

SEGyges 2 days ago

fortunately i wrote an entire post about the difference between the parts of this that are easy to make sense of and the parts that are prohibitively difficult to make sense of, and it was posted on hackernews

sirwhinesalot 2 days ago

Your article, unlike the bizarre, desperate take from the poster above, is actually very good. We do not understand the features the neural net learns; that's 100% true (and really the whole point of them in the first place).

For small image recognition models, we can visualize them and get an intuition for what they're doing, but it doesn't really matter.

For even smaller models, we can translate them into a classical AI model (a mixed integer program, for example) and actually run various "queries" on that model to, e.g., learn that the network recognizes the number "8" by checking just 2 pixels in the image.
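
As a rough sketch of the kind of translation I mean (a hypothetical single-ReLU "network" encoded as a mixed integer program, using the pulp library and a trivial "what's the largest activation?" query; not the actual models or tooling discussed here):

    import pulp  # assumed: a MILP modeling library with its bundled CBC solver

    # big-M encoding of h = relu(w*x + b) for a made-up unit with x in [0, 1]
    w, b, M = 2.0, -1.0, 10.0

    prob = pulp.LpProblem("query_tiny_relu_net", pulp.LpMaximize)
    x = pulp.LpVariable("x", lowBound=0, upBound=1)   # the "pixel" value
    h = pulp.LpVariable("h", lowBound=0)              # the unit's output
    a = pulp.LpVariable("a", cat="Binary")            # is the unit active?

    pre = w * x + b
    prob += h >= pre                    # h is at least the pre-activation
    prob += h <= pre + M * (1 - a)      # if active (a=1), h equals it
    prob += h <= M * a                  # if inactive (a=0), h is 0
    prob += h                           # objective: maximize the output

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print("max activation:", pulp.value(h), "at x =", pulp.value(x))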

None of this changes the fact that we know what these things are and how they work, because we built them. Any comparison to our lack of knowledge of the human brain is ridiculous. LLMs are obviously not conscious; they don't even have real "state". They're an approximated pure function f(context: List<Token>) -> Token that's run in a loop.
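
As a sketch of that "pure function run in a loop" claim (hypothetical names and a toy stand-in for the model, not any real inference API):

    from typing import Callable, List

    Token = int  # placeholder: real tokenizers map text to ids like these

    def generate(f: Callable[[List[Token]], Token],
                 prompt: List[Token],
                 max_new_tokens: int,
                 eos: Token) -> List[Token]:
        # run the next-token function in a loop; the only "state" is the context
        context = list(prompt)
        for _ in range(max_new_tokens):
            nxt = f(context)        # same context in -> same next token out
            context.append(nxt)
            if nxt == eos:
                break
        return context

    # toy stand-in for the model: counts down from the last token until 0
    print(generate(lambda ctx: max(ctx[-1] - 1, 0), [5, 4, 3], 10, eos=0))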

The only valid alarmist take is that we're using black box algorithms to make decisions with serious real-world impact, but this is true of any black box algorithm, not just the latest and greatest ML models.

dpoloncsak 2 days ago

It's a complex adaptive system, right? Isn't that the whole idea? We know how each part of the system works by itself. We know all the inputs, and we can measure the outputs.

I still (even if I actually understood the math) cannot tell you "if you prompt 'x', the model will return 'y'" with 100% confidence.

sirwhinesalot 2 days ago

> If you prompt 'x', the model will return 'y' with 100% confidence.

We can do this for smaller models, which means it's a problem of scale/computing power rather than a fundamental limitation. The situation with the human brain is completely different. We know neurons exchange information and how that works, and we have a pretty good understanding of the architecture of parts of the brain like the visual cortex, but we have no idea of the architecture as a whole.

We know the architecture of an LLM. We know how the data flows. We know what the individual neurons are learning (cuts and bends of a plane in a hyperdimensional space). We know how the weights are learned (backpropagation). We know the "algorithm" the LLM as a whole is approximating (List<Token> -> Token). Yes, there are emergent properties we don't understand, but the same is true of a spam filter.
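
To illustrate the backpropagation point with a toy example (made-up numbers, one ReLU unit, nothing from a real LLM): the gradients come from nothing fancier than the chain rule applied backwards through the computation.

    # one ReLU unit: y = w2 * relu(w1*x + b1), squared-error loss
    def relu(z):
        return max(z, 0.0)

    def loss_and_grads(x, target, w1, b1, w2):
        z = w1 * x + b1
        h = relu(z)
        y = w2 * h
        loss = (y - target) ** 2
        dL_dy = 2 * (y - target)                  # chain rule, step by step
        dL_dw2 = dL_dy * h
        dL_dh = dL_dy * w2
        dL_dz = dL_dh * (1.0 if z > 0 else 0.0)   # ReLU just gates the gradient
        dL_dw1 = dL_dz * x
        dL_db1 = dL_dz
        return loss, (dL_dw1, dL_db1, dL_dw2)

    print(loss_and_grads(x=1.0, target=2.0, w1=0.5, b1=0.1, w2=1.5))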

Comparing this to our lack of understanding of the human brain and discussing how these things might be "conscious" is silly.

ninetyninenine 2 days ago

https://youtu.be/qrvK_KuIeJk?t=284

I don’t appreciate your comments. It’s especially rude to call me desperate and bizarre.

Take a look at the above video, where Geoffrey Hinton, basically the godfather of AI, directly contradicts your statement.

I sincerely hope you self-reflect and are able to realize that you’re the one completely out of it.

Realistically the differences come down to sort of a semantic issue. We both agree that there are things we don’t understand and things that we do understand. It’s just that your overall aggregate generalization comes down to “we overall do understand”, and mine is “we don’t understand shit”.

Again: your aggregate is wrong. Utterly. Preeminent experts are on my side. If we did understand LLMs, we’d be able to edit the individual weights of each neuron to remove hallucinations. But we can’t. Like, literally, we know a solution to the hallucination problem exists. It’s in the weights. We know a certain configuration of weights can remove the hallucination. But even for a single prompt-and-answer pair we do not know how to modify the weights such that the hallucination goes away. We can’t even quantify, formally define, or model what a hallucination is. We describe LLMs in human terms and we manipulate the thing through prompts and vague psychological methods like “chain of thought”.

You think planes can work like this? Do we psychologically influence planes to sort of fly correctly?

Literally no other engineered system on earth exists where the sheer lack of understanding is this large.

sirwhinesalot 2 days ago

Sorry for the remark, it was indeed unnecessarily rude and I apologise.

That said, your appeal to authority means nothing to me. I can simply counter with another appeal to authority, like Yann LeCun, who thinks LLMs are an evolutionary dead end (and I agree).

It matters not that we cannot comprehend them (in the sense of predicting what output they'll give for a given input). Doing that is their job. I also can't comprehend why a support vector machine ends up categorizing spam the way it does.

In both cases we understand the algorithms involved which is completely different from our understanding of the brain and its emergent properties.

polotics 2 days ago

Could you provide a link so we can follow your thread of thought on this? It appears your article got submitted by a user other than you.

ninetyninenine 2 days ago

>This is such a weird take to me. We know exactly how LLMs and neural networks in general work.

Why do you make such crazy statements when the article you’re responding to literally says the opposite of this? Your statement is categorically false, and both the article and I are in total contradiction to it. The rest of your post doesn’t even get into the nuances of what we do and don’t understand.

> They're just highly scaled up versions of their smaller curve fitting cousins. For those we can even make pretty visualizations that show exactly what is happening as the network "learns".

This proves nothing. Additionally, it’s in direct contradiction to what the OP and I wrote. You describe the minuscule aspects that we do understand but neglect to mention that overall we don’t understand. Saying we understand curve fitting is like saying we understand the human brain because we completely understand how atoms work, the human brain is made up of atoms, and therefore we understand the human brain. No. We understand atoms. We don’t understand the human brain, even though we know the brain is just a bunch of atoms. The generalization of this is that the amount we don’t understand eclipses, by a massive margin, the parts we do understand.

Let me spell out what this means overall: we don’t understand shit.

> I don't mean "we can see parts of the brain light up", I mean "we can see the cuts and bends each ReLU is doing to the input".

Even with direct insight into every single signal propagating through the feed-forward network, we still don’t understand it. Like the OP’s article says, it’s a scaling problem. Again, I don’t understand why you only refer to the minuscule aspects we do understand and don’t even once mention the massive aspects we don’t understand.

> We built these things, we know exactly how they work. There's no surprise beyond just how good prediction accuracy gets with a big enough model.

No, we don’t. Sometimes I find basic rationality and common sense just don’t work with people. What ends up working is citing experts in the field saying the exact fucking opposite of you: https://youtu.be/qrvK_KuIeJk?t=284

That’s Geoffrey Hinton, the guy who made ML relevant again in the last decade, literally saying word for word the exact opposite of what you’re saying. The interviewer even said “we built the thing, we know how it works” and Geoffrey is like “no we don’t”. Bro. Wake up.

> Deep Neural Networks are also a very different architecture from what is going on in our brains (which work more like Spiking Neural Networks) and our brains don't do backpropagation, so you can't even make direct comparisons.

Bro, we don’t understand deep neural networks. Throwing around big words like backpropagation, and even understanding that it’s just the chain rule from calculus, doesn’t mean shit. We overall don’t understand it because it’s a scaling problem. There’s no theory or model that characterizes how a signal propagates through a trained network. It’s like saying you know assembly language and therefore you understand all of Linux. No: we understand the assembly, but here the learning algorithm is what built the Linux, and we don’t understand what it built because we didn’t build it ourselves. We don’t understand the human brain and we don’t understand deep neural networks.

You know what, just take a look at what Geoffrey Hinton says. If you feel my take is so bizarre that you felt the need to comment on it, I hope his take rewires your brain and helps you realize you’re the one out of touch. Rationality rarely changes people’s minds, but someone with a higher reputation is often capable of actually getting a person out of their own biased headspace. So listen to what he says, then couple it with the explanation here, and completely flip your opinion. Or don’t. I rarely see people just do an opinion flip. Humans are just too biased.