| ▲ | hodgehog11 6 days ago |
| I am sympathetic to the reasoning as to why LLMs should not be used to help some programmers right now. But I get a little frustrated seeing many of these kinds of posts that talk about fundamental limitations of LLMs vs humans on the grounds that they cannot "logically reason" like a human does. These are limitations in the current approach to training and objectives; internally, we have no clue what is going on. > it’s “just a statistical model” that generates “language” based on a chain of “what is most likely to follow the previous phrase” Humans are statistical models too in an appropriate sense. The question is whether we execute phrase by phrase or not, and whether it even matters in the long term what humans do. > The only way ChatGPT will stop spreading that nonsense is if there is a significant mass of humans talking online about the lack of ZSTD support. Or you can change the implicit bias in the model by being more clever with your training procedure. This is basic stats; not everything is about data. > They don’t know anything, they don’t think, they don’t learn, they don’t deduct. They generate real-looking text based on what is most likely based on the information it has been trained on. This may be comforting to think, but it's just wrong. It would make my job so much easier if it were true. If you take the time to define "know", "think", and "deduct", you will find it difficult to argue current LLMs do not do these things. "Learn" is the exception here, and is a bit more complex, not only because of memory and bandwidth issues, but also because "understand" is difficult to define. |
|
| ▲ | Barrin92 6 days ago | parent | next [-] |
>Humans are statistical models too in an appropriate sense. No, we aren't, and I'm getting tired of this question-begging and completely wrong statement. Human beings are capable of what Kant in fancy words called "transcendental apperception": we're already bringing our faculties to bear on experience, without which the world would make no sense to us. What that means in practical terms for programming problems of this kind is that we can say "I don't know", which the LLM can't, because there's no "I" in the LLM, no unified subject that can distinguish what it knows and what it doesn't, what's within its domain of knowledge and what's outside it. >If you take the time to define "know", "think", and "deduct", you will find it difficult to argue current LLMs do not do these things No, you'd only make such a statement if you don't spend the time to think about what knowledge is. What enables knowledge, which is not raw data but synthesized, structured cognition, is the faculties of the mind, the a priori categories we bring to bear on data. That's why these systems are about as useless as a monkey with a typewriter when you try to have them work on manual memory management in C, because that's less an autocompletion task and more one that requires you to have a working model of the machine in your mind. |
| |
| ▲ | hodgehog11 6 days ago | parent | next [-] | | This is interesting philosophy, and others have better critiques here in that regard. I'm a mathematician, so I can only work with what I can define symbolically. Humans most certainly ARE statistical models by that definition: without invoking the precise terminology, we take input, yield output, and plausibly involve uncertain elements. One can argue as to whether this is the correct language or not, but I prefer to think this way, as the arrogance of human thinking has otherwise failed us in making good predictions about AI. If you can come up with a symbolic description of a deficiency in how LLMs approach problems, that's fantastic, because we can use that to alter how these models are trained, and how we approach problems too! > What that means in practical terms for programming problems of this kind is that we can say "I don't know", which the LLM can't, because there's no "I" in the LLM, no unified subject that can distinguish what it knows and what it doesn't, what's within its domain of knowledge and what's outside it. We seriously don't know whether there is an "I" that is comprehended or not. I've seen arguments either way. But otherwise, this seems to refer to poor internal calibration of uncertainty, correct? This is an important problem! (It's a problem with humans too, but I digress.) LLMs aren't nearly as bad at this as you might think, and there are a lot of things you can do (that the big tech companies do not do) that can better tune a model's self-confidence (as reflected in its logits). I'm not aware of anything that uses this information as part of the context, so that might be a great idea. But on the other hand, maybe this actually isn't as important as we think it is. | | |
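A minimal sketch of the "as reflected in its logits" point above, assuming a Hugging Face causal LM (`gpt2` is only a stand-in model) and using the mean log-probability of the answer tokens as one illustrative confidence heuristic; it shows where the signal lives, not the calibration techniques the comment alludes to:

```python
# Sketch: read a model's own token probabilities as a crude confidence signal.
# Assumes a Hugging Face causal LM; "gpt2" is a stand-in, and mean log-probability
# is just one possible heuristic, not a calibrated uncertainty estimate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def answer_confidence(prompt: str, answer: str) -> float:
    """Mean log-probability the model assigns to the answer tokens, given the prompt."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits  # (1, seq_len, vocab_size)
    # Log-probabilities for each next token, conditioned on everything before it.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full_ids[:, 1:]
    token_log_probs = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Assumes the answer starts on a clean token boundary after the prompt.
    answer_start = prompt_ids.shape[1] - 1
    return token_log_probs[0, answer_start:].mean().item()

# The model should assign noticeably higher confidence to the first completion.
print(answer_confidence("The capital of France is", " Paris"))
print(answer_confidence("The capital of France is", " Lyon"))
```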
| ▲ | Peritract 6 days ago | parent [-] | | > I'm a mathematician, so I can only work with what I can define symbolically. This is a limitation of yours, not an argument. Of course everything will look the same to you if that's the only way you can represent things. We have more disciplines than mathematics because mathematics is not the only valid way to explore things. | | |
| ▲ | hodgehog11 6 days ago | parent [-] | | Yes, of course. I'm not trying to establish further argument or to criticise. But it isn't fruitful to have a conversation where both parties have different definitions in mind. Since my original comment referred to the mathematical definition, it is important for me to clarify that. The rest of my comment is to try to find some middle ground, of which there is plenty in philosophy. If the raw theory of mind arguments for humans were already applicable for recognizing the limitations of LLM designs, we would be much further along than we currently are. |
|
| |
| ▲ | lblume 6 days ago | parent | prev | next [-] | | The position of Kant does not align with the direction modern neuroscience is heading. Current evidence seems to prefer decentralized theories of consciousness like Dennett's multiple drafts model[1], suggesting there is no central point where everything comes together to form conscious experience, but instead that it is itself constituted by collaborative processes that have multiple realizations. [1]: https://en.wikipedia.org/wiki/Multiple_drafts_model | | |
| ▲ | Barrin92 6 days ago | parent [-] | | >Current evidence seems to prefer decentralized theories of consciousness like Dennett There is no such thing as consciousness in Dennett's theory; his position is that it doesn't exist. He is an Eliminativist. This is of course an absurd position with no evidence for it, as people like Chalmers have pointed out (including in that Wikipedia article), and it might be the most comical and ideological position in the last 200 years. | | |
| ▲ | EnergyAmy 6 days ago | parent [-] | | From the link: > However, Dennett is not denying the existence of the mind or of consciousness, only what he considers a naive view of them It doesn't seem like he's an Eliminativist. It also seems like the criticisms rely on harping on about qualia, which is one of the sillier schools of sophistry. I'd need to see actual criticisms before believing that Dennett is pushing for something comical. |
|
| |
| ▲ | rileymat2 6 days ago | parent | prev | next [-] | | It is kind of interesting that you are arguing with a human about the ability to distinguish what is known and not known, where your claim is that humans know and LLMs don’t, but the OP is wrong and does not know what he does not know. | |
| ▲ | HDThoreaun 6 days ago | parent | prev | next [-] | | Kant was a dualist; of course he didn't think humans were statistical models. It just turns out he was (probably) wrong. | |
| ▲ | elliotto 6 days ago | parent | prev | next [-] | | Does a chess engine know, think, or deduct? | |
| ▲ | anuramat 5 days ago | parent | prev [-] | | > because there's no "I" in the LLM, no unified subject is there in humans? > Human beings are capable of are they? > without which the world would make no sense to us does it? |
|
|
| ▲ | raincole 6 days ago | parent | prev | next [-] |
| While the normal distribution meme is notoriously overused, I think it fits the scenario here. LLMs know so much (when you just use ChatGPT for the first time like it's an Oracle machine) -> LLMs don't know anything (when you understand how machine learning works) -> LLMs know so much (when you actually think about what 'know' means) |
|
| ▲ | efilife 6 days ago | parent | prev | next [-] |
> they cannot "logically reason" like a human does Reason? Maybe. But there's one limitation that we currently have no idea how to overcome: LLMs don't know how much they know. If they tell you they don't know something, it may be a lie. If they tell you they do, this may be a lie too. I, a human, certainly know what I know and what I don't, and can recall where I know the information from. |
| |
| ▲ | vidarh 6 days ago | parent | next [-] | | I have never met a human who has a good grasp of what they know and don't know. They may have a better grasp of it than an LLM, but we humans are awfully bad at understanding the limits of our own knowledge, and will argue very strongly in favour of knowing more than we demonstrably do in all kinds of contexts. | | |
| ▲ | bwfan123 6 days ago | parent | next [-] | | > I have never met a human who has a good grasp of what they know and don't know yep. There are 2 processes underlying our behaviors. 1) a meme-copier, which takes in information from various sources and regurgitates it. A rote-memory machine of sorts. Here, memory is populated with info without deconstructing it into a theory. This has been termed "know-that". Here, explanations are shallow, and repeated questioning of why will fail with: I was told so. 2) a builder, which attempts to construct a theory of something. Here, a mechanism is built and understood, and information is deconstructed in the context of the mechanism. This has been termed "know-how". Here, explanations are deeper, and repeated questioning of why will end up with: these are the "givens" in our theory. The problem is that we operate in the "know-that" territory most of the time, and have not made the effort to build theories for ourselves in the "know-how" territory. | |
| ▲ | ModernMech 6 days ago | parent | prev | next [-] | | LLMs are not humans; they are ostensibly tools. Tools are supposed to come with a list of things they can do. LLMs don’t and are therefore bad at being tools, so we anthropomorphize them. But they are also not good at being people, so LLMs are left in this weird in-between state where half the people say they’re useful and half the people say they cause more problems than they solve. | | |
| ▲ | vidarh 6 days ago | parent [-] | | They are fantastic tools to me. They provide abilities that no other tools we have can provide. That they also have flaws does not remove those benefits. | | |
| ▲ | ModernMech 6 days ago | parent [-] | | There's no question they have capabilities that no other tool has. But a good tool goes beyond just doing something; there are some generally agreed-upon principles of tool design that make something good versus just useful. For example, I think a hammer is a good tool because every time I swing it at a nail, it provides the force necessary to drive it into the wood. It's reliable. Sure, sometimes a hammer breaks, but my baseline expectation in using one is that every time I swing the hammer it will behave the same way. Something more complicated, like a Rust compiler, is also a good tool in the same way. It's vastly more intricate than the hammer, yet it still has the good tool property of being reliable; every time I press compile, if the program is wrong, then the compiler tells me that. If it's right, then the compiler passes every time. It doesn't lie, it doesn't guess, it doesn't rate limit, it doesn't charge a subscription, it doesn't silently update causing code to fail, it informs me when changes are breaking and what they are, it allows me to pick my version and doesn't silently deprecate me without recourse, etc. There are of course ecosystems out there where building a project is more like a delicate dance, or a courtship ritual, and those ecosystems are a pain in the ass to deal with. I'm talking XKCD #1987, or NodeJS circa 2014, or just the entire rationale behind Docker. People exit their careers to not have to deal with such technology, because it's like working at the DMV or living in Kafka's nightmares. LLMs are more in that direction, and no one is going to like where we end up if we make them our entire stack, as seems to be the intent of the powers that be. There's a difference between what LLMs are and what they're being sold as. For what they are, they can be useful, and may one day be turned into good tools if some of the major flaws are fixed. On the other hand, we are in the process of totally upending the way our industry works on the basis of what these things will be, which they are selling as essentially an oracle. "The smartest person you know in your pocket", "a simultaneous expert PhD, MD, JD", "Smarter than all humans combined". But there's a giant gulf between what they're selling and what it is, and that gulf is what makes LLMs a poor tool. | |
| ▲ | vidarh 6 days ago | parent [-] | | They are good tools to me. We won't agree on this. They provide abilities no other tool provides me with. They could be better, but they've still provided me with possibilities that I have never had before without hiring humans. | |
| ▲ | ModernMech 6 days ago | parent [-] | | I'm sure they're good for you; I'm not suggesting otherwise. What I'm saying is, if you ask 100 engineers to describe the properties of the best tools they use, the set of adjectives and characteristics they come up with will largely not be applicable to LLMs, and ChatGPT 5 doesn't change that. | |
| ▲ | vidarh 6 days ago | parent [-] | | This is pure, unsubstantiated conjecture. It's also wildly unrealistic conjecture, in my opinion. The first, and most important, measure to me of a good tool is whether it makes me more productive. LLMs do. Measurably so. You will certainly find people who don't believe LLMs do that for them, but that won't change the fact that for a lot of us they are immensely good tools. And a tool doesn't need to fit everyone's processes to be a good tool. | |
| ▲ | ModernMech 5 days ago | parent [-] | | > It's also wildly unrealistic conjecture, in my opinion. The only property of good tools you've mentioned that LLMs have is that they do something useful. But are they reliable? No, they inexplicably work sometimes and don't work other times. Do they have a clear purpose? No, their purpose is muddled and they're sold as being capable of literally everything. Are the ergonomics good? No, the interface is completely opaque, accessible only through a natural language text box that comes with no instructions on how to use it. But then it must be intuitive, right? No, common wisdom on how to use them sounds more like astrological forecasts, or the kind of advice you give people trying to get something from a toddler -- "If you want good results, first you have to get into character!"... etc. etc. > it makes me more productive. Awesome! I'm sure they are doing exactly what you say and your experiences with them are amazing. But your personal productivity and mine aren't the question facing the industry at the moment, or the topic of this discussion. The question isn't whether you personally as an individual find these things useful in your process. If that were the question, I wouldn't be here complaining about them. I'm here because the powers that be are telling us that we must adopt these things into every aspect of our work lives as soon as possible, and if we don't, we deserve to be left behind. THAT is what I'm here to talk about. > And a tool doesn't need to fit everyone's processes to be a good tool. I haven't argued this. Of course a tool doesn't have to fit everyone's process to be good, but we have some generally accepted principles of good tool design, and LLMs don't follow them. That doesn't preclude you from using them well in your process. |
|
|
|
|
|
| |
| ▲ | Jensson 6 days ago | parent | prev [-] | | > I have never met a human who has a good grasp of what they know and don't know But humanity has a good grasp of it; we even created science to solve this. So out of everything we are aware of, humanity is by far the best at this; nothing else even comes remotely close. | |
| ▲ | vidarh 6 days ago | parent [-] | | No, we really don't. We have major disagreements about what we know and don't know all the time. Like right now. | | |
| ▲ | efilife 5 days ago | parent [-] | | An LLM needs to be told what it knows. You don't. It can never say "I don't know" with reasonable accuracy the way a human would. | |
| ▲ | vidarh 5 days ago | parent [-] | | And humans are often wrong both when we say we don't know, and when we claim to know. There likely is a difference in degree of accuracy, but the point I was making was that despite your claim to "certainly know what [you] know", we don't in fact know what we know with anything remotely approaching precision. We know some of what we know, but we can be pressed into doing things we are certain we don't know how to do, only to find the knowledge is still there, and we will confidently proclaim to know things (such as the extent of our knowledge) that we don't. I will agree that LLMs need to acquire a better idea of what they know, but there is no reason to assume that knowing the limits of your own knowledge with any serious precision matters, given how bad humans are at knowing this. So much of human culture, politics, and civil life is centered around resolving conflicts that arise out of our lack of knowledge of our own limits, that this uncertainty is fairly central to what it means to be human. |
|
|
|
| |
| ▲ | AaronAPU 6 days ago | parent | prev | next [-] | | I’m afraid that sense of knowing what you know is very much illusory for humans as well. Everyone is just slowly having to come to terms with that. | | |
| ▲ | efilife 6 days ago | parent [-] | | No. I can tell you what skills I possess. For example: programming, writing music. An LLM cannot do this unless it's told what it knows. |
| |
| ▲ | hodgehog11 6 days ago | parent | prev | next [-] | | You are judging this based on what the LLM outputs, not on its internals. When we peer into its internals, it seems that LLMs actually have a pretty good representation of what they do and don't know; this just isn't reflected in the output because the relevant information is lost in future context. | |
| ▲ | lblume 6 days ago | parent | prev | next [-] | | Do you really know what you don't know? This would rule out unknown unknowns entirely. | | |
| ▲ | add-sub-mul-div 6 days ago | parent | next [-] | | Yes, it's not that people know specifically what they don't know; it's that they develop the wisdom to know those boundaries, anticipate them, and reduce their likelihood and impact. For example, if I use the language of my expertise for a familiar project, then the boundaries where the challenges might lie are known. If I start learning a new language for the project, I won't know which areas might produce unknowns. The LLM will happily give you code in a language it's not trained well on, with the same confidence as in any other language. | |
| ▲ | efilife 6 days ago | parent | prev [-] | | Sorry for copypasting from another comment, but this is relevant: I can tell you what skills I possess. For example: programming, writing music. An LLM cannot do this unless it's told what it knows. I could also tell you whether I studied thing X or attempted to do it, and what success I had. So I'm pretty good at assessing my abilities. An LLM has no idea. | | |
| ▲ | lblume 6 days ago | parent [-] | | This is interesting, because I wouldn't be too sure about that. Whether I am able to play chess well depends exclusively on the strength of my opponents, because there is no absolute baseline. If society somehow decided that what you were writing wasn't "music" anymore, you would be the only person left stating you had that skill. I believe that most claims about one's own skills do come from outward judgement and interpretation of one's actions, not introspection. The only thing humans have is, to radically appropriate the jargon, a way longer context window (spanning an entire lifetime!), together with many ways to compress it. |
|
| |
| ▲ | gallerdude 6 days ago | parent | prev | next [-] | | > OpenAI researcher Noam Brown on hallucination with the new IMO reasoning model: > Mathematicians used to comb through model solutions because earlier systems would quietly flip an inequality or tuck in a wrong step, creating hallucinated answers. > Brown says the updated IMO reasoning model now tends to say “I’m not sure” whenever it lacks a valid proof, which sharply cuts down on those hidden errors. > TLDR, the model shows a clear shift away from hallucinations and toward reliable, self‑aware reasoning. Source: https://x.com/chatgpt21/status/1950606890758476264 | |
| ▲ | mrcartmeneses 6 days ago | parent | prev [-] | | Socrates would beg to differ |
|
|
| ▲ | libraryofbabel 6 days ago | parent | prev | next [-] |
| Yeah. The empty “it’s just a statistical model” critique (or the dressed-up “stochastic parrots” version of it) is almost a sign at this point that the person using it formed their opinions about AI back when ChatGPT first came out, and hasn’t really bothered to engage with it much since then. If in 2022 I’d tried to convince AI skeptics that in three years we might have tools on the level of Claude Code, I’m sure I’d have heard everyone say it would be impossible because “it’s just a statistical model.” But it turned out that there was a lot more potential in the architecture for encoding structured knowledge, complex reasoning, etc., despite that architecture being probabilistic. (Don’t bet against the Bitter Lesson.) LLMs have a lot of problems, hallucination still being one of them. I’d be the first to advocate for a skeptical hype-free approach to deploying them in software engineering. But at this point we need careful informed engagement with where the models are at now rather than cherry-picked examples and rants. |
| |
| ▲ | seba_dos1 6 days ago | parent | next [-] | | Unless what you work on is very simple and mostly mindless, using tools like Claude Code is the exact opposite of how to make the current SotA LLMs useful for coding. The models can help and boost your productivity, but it doesn't happen by letting them do more stuff autonomously. Quite the contrary. And when what you usually work on actually is very simple and mostly mindless, you'd probably benefit more from doing it yourself, so you can progress beyond the junior stuff one day. | | |
| ▲ | structural 6 days ago | parent | next [-] | | Where it really has value is if what you work on is like 33% extremely difficult and 66% boilerplate/tedious. Being able to offload the tedious bits can make more senior engineers 2-3x more productive without the coordination effort of "find a junior engineer to do this, schedule their time, assign the work, follow up on it". (The problem, of course, is that you still need these junior engineers to exist in order to have the next generation of senior engineers, so we must now also think about what our junior folks should be doing to be valuable AND learn.) | |
| ▲ | anuramat 5 days ago | parent | prev [-] | | "real programmers use ed", 2025 edition: you're forever stuck with "junior stuff" if you let a language model handle language. God forbid I don't have to read 10k lines of logs to fix a typo. |
| |
| ▲ | Jensson 6 days ago | parent | prev | next [-] | | > If in 2022 I’d tried to convince AI skeptics that in three years we might have tools on the level of Claude Code, I’m sure I’d have heard everyone say it would be impossible because “it’s just a statistical model.” We already had these coding models in 2022; they were already pattern-matching engines with variables back then, and all your imagination needed to do to get from that to Claude Code today was give them more code examples and make them bigger. They still can't replace even a junior engineer's ability to navigate tasks over even short periods of time; just like back then, they need constant handholding to get anything done. So I don't see what changed except the model being larger with more examples, and therefore you can get larger chunks of coherent code out of them. | |
| ▲ | vidarh 6 days ago | parent | prev [-] | | People repeating the "stochastic parrot" meme in all kinds of variations, if anything, appear to be more like stochastic parrots than the typical LLM is. |
|
|
| ▲ | bilbo-b-baggins 6 days ago | parent | prev | next [-] |
Actually, internally we do know what’s going on these days. Anthropic put out a white paper detailing how Claude can’t do math, but many math examples are out there, so Claude can fake it. I wish you’d stop making LLMs out to be some kind of magic thing they aren’t. |
| |
| ▲ | hodgehog11 4 days ago | parent [-] | | I work in the theory of deep learning, so I can say with some authority that while we know a good number of things, and are able to probe the internals much better than most of the public realises, when it comes to philosophical questions that compare their nature with that of humans, we have a long, long way to go. The biggest problem is that we're still trying to work out what it is we even want to know that will tell us whether we have achieved AGI or not. Linear probes and autoencoders have been useful, but we're quickly reaching the limits of those techniques. And don't even get me started on approaches to theory that operate on cherry-picked examples. Anthropic's contributions have been beneficial to the field, but are far from conclusive. |
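For readers unfamiliar with the technique, here is a minimal sketch of a linear probe in the sense mentioned above, assuming a Hugging Face causal LM (`gpt2`) and a tiny made-up true/false dataset as stand-ins; real interpretability work uses far larger models and datasets, but the mechanics are the same: fit a linear classifier on frozen hidden states and see what is linearly decodable.

```python
# Sketch of a linear probe: a logistic regression over a model's hidden states,
# testing whether a property (here, a toy true/false label) is linearly decodable.
# "gpt2", the layer index, and the four statements are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

statements = [
    ("The capital of France is Paris.", 1),
    ("The capital of France is Berlin.", 0),
    ("Water freezes at 0 degrees Celsius.", 1),
    ("Water freezes at 50 degrees Celsius.", 0),
]

def last_token_state(text: str, layer: int = 6) -> torch.Tensor:
    """Hidden state of the final token at a chosen layer."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden = model(ids).hidden_states[layer]  # (1, seq_len, d_model)
    return hidden[0, -1]

X = torch.stack([last_token_state(s) for s, _ in statements]).numpy()
y = [label for _, label in statements]

# The probe itself: a plain linear classifier over frozen activations.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.score(X, y))  # training accuracy on the toy examples
```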
|
|
| ▲ | bwfan123 6 days ago | parent | prev [-] |
Humans build theories of how things work. LLMs don't. Theories are deterministic symbolic representations of the chaotic worlds of meaning. Take the Turing machine, for example, as a theory of computation in general, Euclidean geometry as a theory of space, and Newtonian mechanics as a theory of motion. A theory gives 100% correct predictions, although the theory itself may not model the world accurately. Such feedback between the theory and its application in the world causes iterations of the theory: from Newtonian mechanics to relativity, etc. Long story short, the LLM is a long way away from any of this. And to be fair to LLMs, the average human is not creating theories; it takes some genius to create them (Newton, Turing, etc.). Understanding something == knowing the theory of it. |
| |
| ▲ | hodgehog11 6 days ago | parent [-] | | > Humans build theories of how things work. LLMs don't. Theories are deterministic symbolic representations of the chaotic worlds of meaning What made you believe this is true? Like it or not, yes, they do (at least to the extent that we can define what you've said). There is a big body of literature exploring this question, and the general consensus is that all performant deep learning models adopt an internal representation that can be extracted as a symbolic representation. | | |
| ▲ | bwfan123 6 days ago | parent [-] | | > What made you believe this is true? I have yet to see a theory coming out of an LLM that is sufficiently interesting. My comment was answering the question of what it means to "understand something". My answer to that is: understanding something is knowing the theory of it. Now, that raises the question of what a theory is. And to answer that: a theory comprises building-block symbols and a set of rules for combining them. For example, building blocks for space (and geometry) could be points, lines, etc. The key point in all of this is symbolism as abstractions to represent things in some world. | | |
| ▲ | hodgehog11 6 days ago | parent [-] | | The "sufficiently interesting" part is the most important qualifier here. My response was talking about theories and representations that we already know, either instinctively from near-birth, or from learned experience. We have not seen anything unique from LLMs because they do not appear to have an internal understanding (in the same sense that I was talking about) that is as broad as an adult human's. But that doesn't mean they lack any understanding. > The key point in all of this is symbolism as abstractions to represent things in some world. The difficulty is understanding how to extract this information from the model, since the output of the LLM is actually a very poor representation of its internal state. |
|
|
|