SEGyges 2 days ago

Fortunately I wrote an entire post about the difference between the parts of this that are easy to make sense of and the parts that are prohibitively difficult to make sense of, and it was posted on Hacker News.

sirwhinesalot 2 days ago | parent | next [-]

Your article, unlike the bizarre, desperate take from the poster above, is actually very good. We do not understand the features the neural net learns; that's 100% true (and really the whole point of them in the first place).

Small image recognition models can be visualized to get an intuition for what they are doing, but it doesn't really matter.

For even smaller models, we can translate them to a classical AI model (a mixed integer program, for example) and actually run various "queries" on the model itself to, e.g., learn that the network recognizes the number "8" by checking just 2 pixels in the image.
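As a minimal sketch of what such a query can look like (assuming a tiny ReLU network with made-up weights and PuLP as the MIP solver; nothing here comes from a real model, and the query is illustrative rather than the "2 pixels" one):

    # Encode a toy ReLU network as a mixed integer program and query it.
    # Assumptions: made-up weights, inputs in [0, 1], PuLP + CBC installed.
    import numpy as np
    import pulp

    # Hypothetical network: 3 inputs -> 2 hidden ReLU units -> 2 classes
    W1 = np.array([[0.6, -0.4, 0.2],
                   [-0.3, 0.5, 0.7]])
    b1 = np.array([0.1, -0.2])
    W2 = np.array([[1.0, -0.8],
                   [-0.5, 1.2]])
    b2 = np.array([0.0, 0.1])

    M = 10.0  # big-M bound on pre-activations (valid for inputs in [0, 1])
    prob = pulp.LpProblem("query_net", pulp.LpMaximize)

    x = [pulp.LpVariable(f"x{i}", lowBound=0, upBound=1) for i in range(3)]
    h = [pulp.LpVariable(f"h{j}", lowBound=0) for j in range(2)]
    z = [pulp.LpVariable(f"z{j}", cat="Binary") for j in range(2)]  # ReLU on/off

    # Big-M encoding of h[j] = max(0, W1[j] . x + b1[j])
    for j in range(2):
        pre = pulp.lpDot(W1[j].tolist(), x) + float(b1[j])
        prob += h[j] >= pre
        prob += h[j] <= pre + M * (1 - z[j])
        prob += h[j] <= M * z[j]

    out = [pulp.lpDot(W2[k].tolist(), h) + float(b2[k]) for k in range(2)]

    # Example query: with input 0 clamped to zero, how strongly can the
    # network still prefer class 1 over class 0, and on what input?
    prob += x[0] == 0
    prob += out[1] - out[0]  # objective: maximize the class-1 margin

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print("max class-1 margin:", pulp.value(prob.objective))
    print("witness input:", [pulp.value(v) for v in x])

The same encoding blows up combinatorially with network size, which is exactly why this kind of exact querying only works for small models.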

None of this changes the fact that we know what these things are and how they work, because we built them. Any comparison to our lack of knowledge of the human brain is ridiculous. LLMs are obviously not conscious; they don't even have real "state". They're an approximated pure function, f(context: List<Token>) -> Token, run in a loop.
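As a toy sketch of that shape (a made-up bigram table stands in for the trained network; purely illustrative, not a real LLM):

    # f(context) -> next token, run in a loop until an end-of-sequence token.
    import random

    # Toy "weights": a hand-written bigram table standing in for a model.
    BIGRAMS = {
        "the": {"cat": 0.6, "mat": 0.4},
        "cat": {"sat": 1.0},
        "sat": {"on": 1.0},
        "on":  {"the": 1.0},
        "mat": {"<eos>": 1.0},
    }

    def model(context):
        """Stateless: maps a token list to a next-token distribution."""
        return BIGRAMS.get(context[-1], {"<eos>": 1.0})

    def generate(prompt, max_tokens=10):
        context = list(prompt)
        for _ in range(max_tokens):        # the "run in a loop" part
            probs = model(context)         # one stateless call per new token
            token = random.choices(list(probs), weights=list(probs.values()))[0]
            if token == "<eos>":
                break
            context.append(token)          # the only "state" is the growing context
        return context

    print(generate(["the"]))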

The only valid alarmist take is that we're using black-box algorithms to make decisions with serious real-world impact, but this is true of any black-box algorithm, not just the latest and greatest ML models.

dpoloncsak 2 days ago | parent | next [-]

It's a complex adaptive system, right? Isn't that the whole idea? We know how each part of the system works by itself. We know all the inputs, and we can measure the outputs.

I still (even if I actually understood the math) cannot tell you "If you prompt 'x', the model will return 'y' with 100% confidence."

sirwhinesalot 2 days ago | parent [-]

> If you prompt 'x', the model will return 'y' with 100% confidence.

We can do this for smaller models, which means it's a problem of scale and computing power rather than a fundamental limitation. The situation with the human brain is completely different. We know that neurons exchange information and how that works, and we have a pretty good understanding of the architecture of parts of the brain, like the visual cortex, but we have no idea of the architecture as a whole.

We know the architecture of an LLM. We know how the data flows. We know what the individual neurons are learning (cuts and bends of a plane in a high-dimensional space). We know how the weights are learned (backpropagation). We know the "algorithm" the LLM as a whole is approximating (List<Token> -> Token). Yes there are emergent properties we don't understand but the same is true of a spam filter.

Comparing this to our lack of understanding of the human brain and discussing how these things might be "conscious" is silly.

ninetyninenine 2 days ago | parent [-]

>Comparing this to our lack of understanding of the human brain and discussing how these things might be "conscious" is silly.

Don't call my claim silly. I'm sick of your attitude. Why can't you have a civil discussion?

Literally, we don't know. You can't claim that it's silly when you can't even define what consciousness is. You don't know how human brains work, you don't know how consciousness forms, and you don't know how emergence in LLMs works. So your claim here is logically just made up out of thin air.

Sure, we "understand" LLMs from the curve-fitting perspective. But the entirety of why we use LLMs and what we use them for arises from the emergence, which is what we don't understand. Curve fitting is like 1% of the LLM; it is the emergent properties that we completely don't get (99%) and take advantage of on a daily basis. Curve fitting is just a high-level concept that allows us to construct the algorithm, which is the actual thing that does the hard work of wiring up the atomic units of the network.

>Yes there are emergent properties we don't understand but the same is true of a spam filter.

Yeah, and so? Your statement proves nothing. It just illustrates a contrast in sentiment. The spam filter is a trivial thing; the human brain is not.

We don't understand the spam filter. And the most interesting part of it all is that the SAME scaling problem that prevents us from understanding the spam filter can be characterized as the reason we can't understand BOTH the LLM and the human brain.

Your statement doesn't change anything. It's just using sentiment to try to re-characterize a problem in a different light.

ninetyninenine 2 days ago | parent | prev [-]

https://youtu.be/qrvK_KuIeJk?t=284

I don’t appreciate your comments. It was especially rude to call me desperate and bizarre.

Take a look at the above video, where Geoffrey Hinton, basically the godfather of AI, directly contradicts your statement.

I sincerely hope you self-reflect and are able to realize that you’re the one completely out of it.

Realistically the differences come down to sort of a semantic issue. We both agree that there are things we don’t understand and things that we do understand. It’s just that your overall aggregate generalization comes down to “we overall do understand”, while mine is “we don’t understand shit”.

Again: your aggregate is wrong. Utterly. Preeminent experts are on my side. If we did understand LLMs, we’d be able to edit the individual weights of each neuron to remove hallucinations. But we can’t. Like, literally, we know a solution to the hallucination problem exists. It’s in the weights. We know a certain configuration of weights can remove the hallucination. But even for a single prompt-and-answer pair, we do not know how to modify the weights such that the hallucinations go away. We can’t even quantify, formally define, or model what a hallucination is. We describe LLMs in human terms, and we manipulate the thing through prompts and vague psychological methods like “chain of thought”.

You think planes can work like this? Do we psychologically influence planes to sort of fly correctly?

Literally. No other engineered system like this exists on earth where the sheer lack of understanding is this large.

sirwhinesalot 2 days ago | parent [-]

Sorry for the remark, it was indeed unnecessarily rude and I apologise.

That said, your appeal to authority means nothing to me. I can simply counter with another appeal to authority, like Yann LeCun, who thinks LLMs are an evolutionary dead end (and I agree).

It matters not that we cannot comprehend them (in the sense of predicting what output they'll give for a given input). Doing that is their job. I also can't comprehend why a support vector machine ends up categorizing spam the way it does.

In both cases we understand the algorithms involved which is completely different from our understanding of the brain and its emergent properties.

ninetyninenine 2 days ago | parent [-]

Apology not accepted. Your initial take was to cast me as out of touch. With my appeal to authority you now realize that my stance occupies a very valid place even though you disagreed. It wasn’t just rude; statements like “bizarre” are inaccurate, given that Geoffrey Hinton agrees with me. So even if I’m not offended by your rudeness, it’s not a bizarre take at all. It’s a valid take.

That being said, Yann LeCun is not in agreement with you either. His only claims are that LLMs are not AGI and that hallucinations in LLMs can never be removed.

The debate here isn’t even about that. The debate here is that we don’t understand LLMs. Whether the LLM is AGI or whether we can ever remove the hallucinations is COMPLETELY orthogonal.

So you actually can’t counter with another appeal to authority. Either way I didn’t “just” appeal to authority. I literally logically countered every single one of your statements as well.

sirwhinesalot 2 days ago | parent [-]

There's a huge jump from "we cannot predict the output of an LLM given its input" to "we don't understand LLMs", or that they might be conscious or that this is in any way equivalent to our lack of understanding of the human brain.

We also don't understand (in that sense) any other ML model of sufficient size. Learning features we humans cannot come up with is its job. We can understand (in that sense) sufficiently small models because we have enough computational power to translate them to a classical AI model and query it.

That means it is a problem of scale, not of some fundamental property unique to LLMs.

The bizarre take is being spooked by this. It's been true of simpler models for a very long time. Not a problem.

ninetyninenine 2 days ago | parent [-]

>There's a huge jump from "we cannot predict the output of an LLM given its input" to "we don't understand LLMs", or that they might be conscious or that this is in any way equivalent to our lack of understanding of the human brain.

No, it's not. There are huge similarities between artificial neural networks and the human brain. We understand not only atoms but also individual biological neurons. So the problem of understanding the human brain is in actuality ALSO a scaling problem. Granted, I realize the human brain is much more complex in terms of network connections and how it rewires dynamically, but my point still stands.

Additionally, we can't even characterize the meaning of consciousness. You're likely thinking consciousness is some sort of extremely complex or very powerful concept. But the word is loaded, and there is so much we don't know that we can't actually say. Consciousness could be a very trivial thing; we actually have no idea.

I agree that the brain is much more complex and much harder to understand, and that we understand much less of it. But this does not detract from the claim above that we fundamentally don't understand the LLM, to such a degree that we can't even make a statement about whether or not an LLM is conscious. To reiterate, PART of this comes from the fact that we ALSO don't understand what consciousness itself is.

>The bizarre take is being spooked by this. It's been true of simpler models for a very long time. Not a problem.

This is a hallucination by you. I'm not spooked at all. I don't know where you're getting that from. My initial post's tone was one of annoyance, not "spooked". I'm annoyed by all the claims from people like you saying "we completely understand LLMs".

I mean, doesn't this show how similar you are to an LLM? You hallucinated that I was spooked when I indicated no such thing. I think here's a more realistic take: you're spooked. If what I said were categorically true, then you'd be spooked by the implications, so part of what you do is choose the most convenient reality that's within the realm of possibility such that you aren't spooked.

Like, I understand that classifying consciousness as this trivial thing that can possibly come about as an emergent side effect in an LLM could be a spooky thing. But think rationally. Given how much we don't know about LLMs, human brains, and consciousness, we in ACTUALITY don't know if this is what's going on. We can't make a statement either way. And this is the most logical explanation. It has NOTHING to do with being "spooked", which is an attribute that shouldn't be part of any argument.

sirwhinesalot 2 days ago | parent [-]

Hackernews really isn't a good place for a serious discussion, so I'll just clarify my position.

I think you're spooked for the same reason I think that all the "AI alarmists" whose alarmism is based on our lack of understanding of LLMs are spooked: that because we "lack understanding", it follows that AI is "out of our control" or is on the verge of becoming "conscious" or "intelligent", whatever that means.

Except this isn't true to me. Yes, we can't predict how inputs will map to outputs, but that's nothing unexpected? This has been true of nearly every ML model in practical use (not just those based on neural nets) for a very long time.

I don't perceive this as a "lack of understanding", in the same way I don't consider it a "lack of understanding" when we can't predict the output of a Support Vector Machine classifying email as spam, or when we can't predict how the coefficients of a radial basis function end up accurately approximating the behavior of a complex physical system. To me they're all a "lack of interpretability", which is a different thing.
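As a minimal sketch of the spam example (assuming scikit-learn and a random toy dataset; nothing here is a real spam corpus): the fitted classifier works, but its learned support vectors and dual coefficients don't read as an explanation of any single decision.

    # Train an RBF-kernel SVM on toy "spam" data and look at what it learned.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))                       # 200 "emails", 20 features
    y = (X[:, 0] + 0.5 * X[:, 3] ** 2 > 1).astype(int)   # arbitrary "spam" rule

    clf = SVC(kernel="rbf").fit(X, y)
    print("training accuracy:", clf.score(X, y))
    print("support vectors:", clf.support_vectors_.shape)  # the learned "model"
    print("dual coefficients:", clf.dual_coef_.shape)       # opaque to a human reader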

This is, to me, qualitatively different from our lack of understanding of the human brain. We know the algorithm an LLM is executing, because we set it up. We know how it learns, because we invented the algorithm that does it. We understand pretty well what's happening between the neurons because it's just a scaled up version of smaller models, whose behavior we have visualized and understand pretty well. We know how it "reasons" (in the sense of "thinking" models) because we set it up to "reason" in that manner from how we trained it.

Our understanding of the human brain is not even close to this. We can't even understand the most basic of brains.

Even postulating that LLMs are conscious, whatever that actually is in reality, is nonsensical. They're not even alive! What would "consciousness" even entail for a pure function? There's no reason to even bring that up other than to hype these things as more than what they are (be it positively or negatively).

> I think the fact of the matter is, if you're putting your foot down and saying LLMs aren't intelligent... you're wildly illogical and misinformed about the status quo of Artificial intelligence

They're just as intelligent as a chess engine is intelligent. They're algorithms.

> Also the characterization in the article is mistaken. It says we understand LLMs in a limited way. Yeah sure. It's as limited as our understanding of the human brain.

We understand enough about how they work that we know just forcing them to output more tokens leads to better results and we have a good intuition as to why (see: Karpathy's video on the subject). It's why when asked a math question they spit out a whole paragraph rather than the answer directly, and why "reasoning" is surprisingly effective (we can see from open models that reasoning often just spits out a giant pile of nonsense). More tokens = more compute = more accuracy. A bit similar to the number of noise removal steps in a diffusion model.
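A rough back-of-the-envelope sketch of the "more tokens = more compute" point (assuming a hypothetical 7B-parameter decoder and the usual ~2 FLOPs per parameter per generated token rule of thumb, ignoring the attention term; the numbers are illustrative only):

    # Each emitted token costs one forward pass, so compute grows with output length.
    N_PARAMS = 7e9  # hypothetical 7B-parameter model

    def generation_flops(num_output_tokens: int) -> float:
        # ~2 FLOPs per parameter per token of forward-pass work (rule of thumb).
        return 2 * N_PARAMS * num_output_tokens

    print(f"terse answer (5 tokens):        {generation_flops(5):.2e} FLOPs")
    print(f"'reasoning' trace (500 tokens): {generation_flops(500):.2e} FLOPs")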

ninetyninenine 2 days ago | parent [-]

>I think you're spooked for the same reason I think that all the "AI alarmists" whose alarmism is based on our lack of understanding of LLMs are spooked: that because we "lack understanding", it follows that AI is "out of our control" or is on the verge of becoming "conscious" or "intelligent", whatever that means.

Yeah, well, you made this shit up out of thin fucking air. I'm not spooked. We lack understanding of it, but that doesn't mean we can't use it. It doesn't mean there's going to be a Skynet-level apocalypse. You notice I didn't say anything like that? There's literally no evidence for it, AND I never said anything like that. Here's what I think: I don't know how an airplane works, and I'm fine with it; I can still ride an airplane. I also don't know how an LLM works. I'm also fine with it. It just so happens that nobody knows how an LLM works. I'm also fine with that.

This spooked bullshit came out of your own hallucination. You made that shit up. My initial post is NOT meant to be alarmist. It's meant to be fucking annoyed at people who insist that everything is totally simple and that we totally get it. The fact of the matter is, we may not understand it, but I don't think anything catastrophic is going to emerge from the fact that we don't understand it. Even if the LLM is sentient, I don't think there's much to fear.

However this doesn't mean that what the alarmists say isn't real. We just don't know.

>Except this isn't true to me. Yes, we can't predict how inputs will map to outputs, but that's nothing unexpected? This has been true of nearly every ML model in practical use (not just those based on neural nets) for a very long time.

Doesn't change the fact that we don't fucking know what's going on. Like I said, this is something you're spooked about IF what I said was true. I'm not spooked about it, period. You're adding shit to the topic that's OFF topic.

>I don't perceive this as a "lack of understanding", in the same way I don't consider it a "lack of understanding" when we can't predict the output of a Support Vector Machine classifying email as spam, or when we can't predict how the coefficients of a radial basis function end up accurately approximating the behavior of a complex physical system. To me they're all a "lack of interpretability", which is a different thing.

This isn't a perception problem. It's not as if perceiving something in a different way suddenly makes your perception valid. NO. We categorically DO NOT understand it. Stop playing with words. Lack of understanding IS lack of interpretability. It's the same fucking thing. If you can't interpret what happened, you don't understand what happened.

Maybe what you're trying to say here is that we understand LLMs enough in such a way that you aren't spooked. Since you made up all that bullshit about me being spooked, I'm guessing that's what you mean. But the fact of the matter remains the same: what we DON'T understand about LLMs far exceeds what we currently know.

>This is, to me, qualitatively different from our lack of understanding of the human brain. We know the algorithm an LLM is executing, because we set it up. We know how it learns, because we invented the algorithm that does it. We understand pretty well what's happening between the neurons because it's just a scaled up version of smaller models, whose behavior we have visualized and understand pretty well. We know how it "reasons" (in the sense of "thinking" models) because we set it up to "reason" in that manner from how we trained it.

Sure there are differences. That's obvious. But the point is we STILL don't understand LLMs in essence. That is still true despite your comparison here.

>Our understanding of the human brain is not even close to this. We can't even understand the most basic of brains.

If we understand 1% of LLMs but only 0.1% of the human brain, that's a dramatic 10x increase in our understanding of LLMs OVER the brain. But it still doesn't change my main point: Overall we. don't. understand. how. LLMs. work. This is exactly the way I would characterize our overall understanding holistically.

>Even postulating that LLMs are conscious, whatever that actually is in reality, is nonsensical. They're not even alive! What would "consciousness" even entail for a pure function? There's no reason to even bring that up other than to hype these things as more than what they are (be it positively or negatively).

Your statement is in itself nonsensical because you don't even know what consciousness or being alive even means. There are several claims here about things you don't know about (LLMs and human brains), made using words you can't even define ("alive" and "consciousness"). The rational, point-by-point thing you need to realize is that you're not constructing your claim from logic. You're not saying "we know A, therefore B must be true". <--- that is how you construct an argument.

Whereas I'm saying: you're making claim A using B, C, and E, but we don't know anything about B, C, and E, so your claim is baseless. You get it? We don't know.

>They're just as intelligent as a chess engine is intelligent. They're algorithms.

But you don't understand how the emergent effects of the algorithm work, so you can't make the claim that they are as intelligent as a chess engine. See? You make claim A and I said your claim A is based on fact B, but B is something you don't know anything about. Can you counter this? No.

>We understand enough about how they work that we know just forcing them to output more tokens leads to better results and we have a good intuition as to why (see: Karpathy's video on the subject). It's why when asked a math question they spit out a whole paragraph rather than the answer directly, and why "reasoning" is surprisingly effective (we can see from open models that reasoning often just spits out a giant pile of nonsense). More tokens = more compute = more accuracy. A bit similar to the number of noise removal steps in a diffusion model.

This is some trivial ballpark understanding that is clearly equivalent to overall NOT understanding. You're just describing something like curve fitting again.

sirwhinesalot 2 days ago | parent [-]

> Maybe what you're trying to say here is that we understand LLMs enough in such a way that you aren't spooked. Since you made up all that bullshit about me being spooked, I'm guessing that's what you mean.

Correct. I bundled you with the alarmists who speak in similar ways, as pattern-matching brains tend to do. Not a hallucination in the LLM sense, just standard probability.

> If we understand 1% of LLMs but only 0.1% of the human brain, that's a dramatic 10x increase in our understanding of LLMs OVER the brain. But it still doesn't change my main point: Overall we. don't. understand. how. LLMs. work. This is exactly the way I would characterize our overall understanding holistically.

And it's not how I characterize it at all. What algorithm is your brain running right now? Any idea? We have no clue. We know the algorithm the LLM is executing: it's a token prediction engine running in a loop. We wrote it. We know enough about how it works to know how to make it better (e.g., Mixture of Experts, "Reasoning").

This is not a "0.1x" or "10x" or whatever other quantitative difference; it's a qualitative difference in understanding. Not being able to predict the input-output relationship of any sufficiently large black-box algorithm does not give one carte blanche to jump to conclusions regarding what they may or may not be.

How large does a black-box model need to be before you entertain that it might be "conscious" (whatever that may actually be)? Is a sufficiently large spam filter conscious? Is it even worth entertaining such an idea? Or is it just worth entertaining for LLMs because they write text that is sufficiently similar to human written text? Does this property grant them enough "weight" that questions regarding "consciousness" are even worth bringing up? What about a StarCraft-playing bot based on reinforcement learning? Is it worth bringing up for one? We "do not understand" how they work either.

ninetyninenine 2 days ago | parent [-]

>Correct. I bundled you with the alarmists who speak in similar ways, as pattern-matching brains tend to do. Not a hallucination in the LLM sense, just standard probability.

First off what you did here is common for humans.

Second off, it's the same thing that happens with LLMs. You don't know fact from fiction, and neither does the LLM, so it predicts something probable given limited understanding. It is not different. You made shit up. You hallucinated off of a probable outcome. LLMs do the same.

Third, as a human, it's on you when you don't verify the facts. If you make shit up by accident, that's your fault and your reputation on the line. It's justified here to call you out for making crap up out of thin air.

Either make better guesses or don't guess at all. For example this guess: "Maybe what you're trying to say here is that we understand LLMs enough in such a way that you aren't spooked." was spot on by your own admission.

>And it's not how I characterize it at all. What algorithm is your brain running right now? Any idea? We have no clue. We know the algorithm the LLM is executing: it's a token prediction engine running in a loop. We wrote it. We know enough about how it works to know how to make it better (e.g., Mixture of Experts, "Reasoning").

This has nothing to do with quantification; that's just an artifact of the example I'm using, and it's only there to illustrate relative differences in the amount we know.

Your characterization is that we know MUCH more about the LLM than we do the brain. So I'm illustrating that, yeah, EVEN though your characterization is true, THE amount we know about the LLM is still minuscule. Hence the 10x improvement from 0.1% to 1%. In the end we still don't know shit; it's still at most 1% of what we need to know. Quantification isn't the point; it wasn't your point, and it's not mine. It's here to illustrate the proportion of knowledge, WHICH was indeed your POINT.

>How large does a black-box model need to be before you entertain that it might be "conscious" (whatever that may actually be)? Is a sufficiently large spam filter conscious?

I don't know. You don't know either. We both don't know. Because like I said we don't even know what the word means.

>Is it even worth entertaining such an idea?

Probably not for a spam filter. But technically speaking, We. don't. actually. know.

However, qualitatively speaking, it is worth entertaining the idea for an LLM, given how similar it is to humans. We both understand WHY plenty of people are entertaining the idea, right? You and I totally get it. What I'm saying is that GIVEN that we don't know either way, we can't dismiss what other people are entertaining.

Also, your method of rationalizing all of this is flawed. Like, you use comparisons to justify your thoughts. You don't want to think a spam filter is sentient, so you think that if the spam filter is comparable to an LLM, then we must think an LLM isn't sentient. But that doesn't logically follow, right? How is a spam filter similar to an LLM? There are differences, right? Just because they share similarities doesn't make your argument suddenly logically flow. There are similarities between spam filters and humans too! We both use neural nets? Therefore, since spam filters aren't sentient, humans aren't either? Do you see how this line of reasoning can be fundamentally misapplied everywhere?

I mean, the comparison logic is flawed, ON top of the fact that we don't even know what we're talking about... I mean... what is consciousness? And we don't in actuality understand the spam filter enough to know whether it's sentient. If ONE aspect of your logic made sense, we could possibly say I'm just being pedantic and that certain assumptions are given... but your logic is broken everywhere. Nothing works. So I'm not being pedantic.

>Or is it just worth entertaining for LLMs because they write text that is sufficiently similar to human written text?

Yes. Many people would agree. It's worth entertaining. This "worth" is a human measure. Not just qualitative, but also opinionated, and it is my opinion and many people's opinion that “yes”, it is worth it. Hence why there's so much debate around it. Even if you don't feel it's worth “entertaining”, at least you have the intelligence to understand why so many people think it's worth it to discuss.

>What about a StarCraft-playing bot based on reinforcement learning? Is it worth bringing up for one? We "do not understand" how they work either.

Most people are of the opinion that “no”, it is not worth entertaining; it is better to ask this question of the LLM. Of course you bring up these examples because you think the comparison chains everything together. You think that if it's not worth it for the spam filter, it's not worth it in your mind to consider sentience for anything that is, in your opinion, “comparable” to it. And like I deduced earlier, I'm saying you're wrong; this type of logic doesn't work.

polotics 2 days ago | parent | prev [-]

Could you provide a link so we can follow your thread of thought on this? It appears your article was submitted by a user other than you.