ninetyninenine 2 days ago

>This is such a weird take to me. We know exactly how LLMs and neural networks in general work.

Why do you make such a sweeping claim when the article you're responding to literally says the opposite? Your statement is categorically false, and both the article and I directly contradict it. The rest of your post doesn't even get into the nuances of what we do and don't understand.

> They're just highly scaled up versions of their smaller curve fitting cousins. For those we can even make pretty visualizations that show exactly what is happening as the network "learns".

This proves nothing, and it directly contradicts what the OP and I wrote. You describe the minuscule aspects we do understand while ignoring the enormous part we don't. Saying we understand LLMs because we understand curve fitting is like saying we understand the human brain because we understand atoms and the brain is made of atoms. No. We understand atoms; we don't understand the brain, even though the brain is "just" a bunch of atoms. The general point is that what we don't understand eclipses, by a massive margin, what we do.
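
To be concrete about what "we understand the small curve-fitting case" actually means, here's a rough toy sketch in plain numpy (my own illustration, not from the article; the network size and target curve are arbitrary). At this scale you can inspect every weight and every ReLU kink, and that's exactly the level of understanding that does not carry over to a billion-parameter model:

```python
# Toy sketch (my illustration): a tiny 1-hidden-layer ReLU network fit to a
# 1-D curve by plain gradient descent. At this scale every weight and every
# "cut" each ReLU makes is inspectable -- that is the part we understand.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(x)                          # arbitrary target curve

H = 8                                  # tiny hidden layer: every unit is inspectable
W1 = rng.normal(0, 1, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)
lr = 0.01

for step in range(5000):
    h = np.maximum(0, x @ W1 + b1)     # each ReLU bends the input at x = -b1/W1
    pred = h @ W2 + b2
    err = pred - y                     # squared-error gradient, by the chain rule
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * (h > 0)
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

# Each hidden unit's kink location -- the kind of thing you can plot and
# fully explain for a network this small.
print("ReLU kinks at x =", (-b1 / W1.ravel()).round(2))
```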

Let me spell out what this means overall: we don't understand shit.

> I don't mean "we can see parts of the brain light up", I mean "we can see the cuts and bends each ReLU is doing to the input".

Even with direct insight into every single signal propagating through the feed-forward network, we still don't understand it. As the OP's article says, it's a scaling problem. Again, I don't understand why you keep pointing only to the minuscule aspects we do understand without once mentioning the massive aspects we don't.
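
And to be clear about what "direct insight into every signal" would even look like, here's another toy sketch (again my own illustration, plain numpy, made-up layer sizes) that records every intermediate activation in a forward pass. Total mechanical visibility, and for a large trained network it still explains nothing about why the output is what it is:

```python
# Toy sketch (my illustration): record every single intermediate value in a
# forward pass of a feed-forward ReLU network. Complete mechanical
# transparency -- which is not the same thing as understanding.
import numpy as np

def forward_with_trace(x, layers):
    """layers: list of (W, b) tuples. Returns output plus every activation."""
    trace = [x]
    for W, b in layers[:-1]:
        x = np.maximum(0, x @ W + b)   # every ReLU cut is visible here
        trace.append(x)
    W, b = layers[-1]
    trace.append(x @ W + b)
    return trace[-1], trace

rng = np.random.default_rng(1)
layers = [(rng.normal(size=(4, 16)), np.zeros(16)),
          (rng.normal(size=(16, 16)), np.zeros(16)),
          (rng.normal(size=(16, 2)), np.zeros(2))]
out, trace = forward_with_trace(rng.normal(size=(1, 4)), layers)
print([t.shape for t in trace])        # every signal, fully observable
```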

> We built these things, we know exactly how they work. There's no surprise beyond just how good prediction accuracy gets with a big enough model.

No, we don't. Sometimes I find basic rationality and common sense just don't work on people. What does work is citing experts in the field saying the exact fucking opposite of what you're saying: https://youtu.be/qrvK_KuIeJk?t=284

That's Geoffrey Hinton, the guy who made ML relevant again over the last decade, saying literally word for word the opposite of what you're saying. The interviewer even says "we built the thing, we know how it works" and Geoffrey replies "no, we don't". Bro. Wake up.

> Deep Neural Networks are also a very different architecture from what is going on in our brains (which work more like Spiking Neural Networks) and our brains don't do backpropagation, so you can't even make direct comparisons.

Bro, we don't understand deep neural networks. Throwing around big words like backpropagation, and even understanding that it's just the chain rule from calculus, doesn't mean shit. We don't understand the overall system because it's a scaling problem: there is no theory or model that characterizes how a signal propagates through a trained network. It's like claiming that because you know assembly language you understand all of Linux. No. We understand the assembly, but here the learning algorithm is what wrote the "Linux", not us, so we understand it even less, because we didn't build it. We don't understand the human brain and we don't understand deep neural networks.
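
Here's how little "it's just the chain rule" actually buys you, as a toy sketch (my own example, a single sigmoid unit with made-up numbers). The mechanism is one line of calculus that you can check numerically; knowing it is the "assembly language" level, and it says nothing about what a trained network has learned:

```python
# Toy sketch (my illustration): "backprop is just the chain rule" for a single
# sigmoid unit, verified against a numerical derivative.
import numpy as np

def loss(w, x=1.5, y=0.3):
    p = 1 / (1 + np.exp(-w * x))                # sigmoid(w * x)
    return (p - y) ** 2

def grad_chain_rule(w, x=1.5, y=0.3):
    p = 1 / (1 + np.exp(-w * x))
    return 2 * (p - y) * p * (1 - p) * x        # dL/dp * dp/dz * dz/dw

w = 0.7
numeric = (loss(w + 1e-6) - loss(w - 1e-6)) / 2e-6
print(grad_chain_rule(w), numeric)              # the two gradients match
```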

You know what, just go look at what Geoffrey Hinton says. If you find my take so bizarre that you felt the need to comment on it, I hope his take rewires your brain and helps you realize you're the one out of touch. Rationality rarely changes people's minds, but someone with a higher reputation can often get a person out of their own biased headspace. So listen to what he says, couple it with the explanation here, and flip your opinion completely. Or don't. I rarely see people just do an opinion flip. Humans are too biased.