▲ jakewins 2 days ago
I mean, I'm just some guy, but in my mind:

- They are not making progress, currently. The elephant-in-the-room problem of hallucinations is exactly the same as, or as I said above, worse than it was 3 years ago.
- It's clearly possible to solve this, since we humans exist and our brains don't have this problem.

There are then two possible paths: either the hallucinations are fundamental to the current architecture of LLMs, and there's some other aspect of the human brain's configuration that they've yet to replicate, or the hallucinations will go away with better and more training. The latter seems to be the bet everyone is making; that's why all these data centers are being built, right? So the question is whether larger training will solve the problem, and whether there's enough training data, silica and electricity on earth to perform that "scale" of training.

There are 86B neurons in the human brain. Each one is a stand-alone living organism, like a biological microcontroller. It has constantly mutating state and memory: short term through RNA and the presence or absence of proteins, long term through chromatin formation enabling and disabling its own DNA over time, and in theory also permanently through DNA rewriting via TEs. Each one has a vast array of input modes: direct electrical stimulation, chemical signalling through a wide array of signalling molecules, and electrical field effects from adjacent cells.

Meanwhile, GPT-4 has 1.1T floats. No billions of interacting microcontrollers, just static floating points describing a network topology. The complexity of the neural networks that run our minds is spectacularly higher than that of the simulated neural networks we're training on silicon.

That's my personal bet. I think the 86B interconnected stateful microcontrollers are so much more capable than the 1T static floating points, and the 1T static floating points are already nearly impossibly expensive to run. So I'm bearish, but of course I don't actually know. We will see. For now all I can conclude is that the frontier model developers lie incessantly in every press release, just like their LLMs.
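A quick back-of-envelope sketch of that comparison, using only the figures above; the per-neuron state number is a made-up placeholder purely to show how the scales relate, not a measurement:

```python
# Rough scale comparison using the figures from the comment above.
# The per-neuron state figure is a made-up placeholder, not a measurement.

NEURONS_HUMAN_BRAIN = 86e9   # ~86 billion neurons
PARAMS_GPT4 = 1.1e12         # ~1.1 trillion parameters (as cited above)
BYTES_PER_PARAM = 2          # assuming fp16 weights

# Hypothetical: grant each neuron a modest amount of mutable state
# (RNA/protein levels, chromatin marks, synaptic weights) for comparison.
ASSUMED_STATE_BYTES_PER_NEURON = 10_000  # pure guess, for illustration only

model_bytes = PARAMS_GPT4 * BYTES_PER_PARAM
brain_state_bytes = NEURONS_HUMAN_BRAIN * ASSUMED_STATE_BYTES_PER_NEURON

print(f"GPT-4 weights (static):        {model_bytes / 1e12:.1f} TB")
print(f"Brain state (mutable, guess):  {brain_state_bytes / 1e12:.1f} TB")
print(f"Parameters per neuron:         {PARAMS_GPT4 / NEURONS_HUMAN_BRAIN:.1f}")
```

Even taking the 1.1T figure at face value, that works out to only about 13 static parameters per neuron, which is the gap the comment is pointing at.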
▲ xmcqdpt2 a day ago
The complexity of actual biological neural networks became clear to me when I learned about the different types of neurons: https://en.wikipedia.org/wiki/Neural_oscillation There are clock neurons, ADC neurons that transform the analog intensity of a signal into counts of digital spikes, neurons that integrate signals over time, neurons that synchronize with each other, etc. etc. Transformer models have none of this.
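A toy sketch of that analog-to-spike behavior, using a standard leaky integrate-and-fire model (not tied to any specific biological neuron type; all constants here are arbitrary):

```python
def lif_spike_count(input_current, steps=1000, dt=1e-3,
                    tau=0.02, threshold=1.0, reset=0.0):
    """Minimal leaky integrate-and-fire neuron: integrates an analog
    input over time and emits discrete spikes whenever the membrane
    potential crosses a threshold. Constants are arbitrary."""
    v = 0.0
    spikes = 0
    for _ in range(steps):
        # Leaky integration: the potential decays toward rest while
        # the input current drives it upward.
        v += dt * (-v / tau + input_current)
        if v >= threshold:
            spikes += 1
            v = reset  # fire, then reset
    return spikes

# A stronger analog input yields a higher digital spike count
# over the same simulated one-second window.
for current in (60.0, 90.0, 120.0):
    print(f"input {current:>5.1f} -> {lif_spike_count(current)} spikes")
```

The point of the sketch is only that the output is a discrete count driven by continuous input and internal state over time, which has no direct analogue in a single transformer weight.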
▲ empiricus 2 days ago
Thanks, that's a reasonable argument. Some critique: based on this argument, it is very surprising that LLMs work so well, or at all. The fact that even small LLMs do something suggests that the human substrate is quite inefficient for thinking. Compared to LLMs, it seems to me that 1. some humans are more aware of what they know; 2. humans have very tight feedback loops to regulate and correct themselves. So I imagine we do not need much more scaling, just slightly better AI architectures. I guess we will see how it goes.