trashtester 2 days ago
> Sure, but talking about Turing completeness is not about efficiency, but about the computational ability of a system.

I know. That is part of my claim that this "talking about Turing completeness" is a distraction, specifically because it ignores efficiency/speed.

> Again, the point is not that this is a good way of doing something, but that it serves as a way to point out that what it takes to make an LLM Turing complete is so utterly trivial.

And again, I KNOW that it's utterly trivial to create a Turing complete system. I ALSO know that a Turing complete system can perform ANY computation (it pretty much defines what a computation is), given enough time and memory/storage. But if such a calculation takes 10^6 times longer than necessary, it's also utterly useless to approach it this way.

Specifically, the problem with Turing completeness is that it implies the ability to create global branches/dependencies in the code based on the output of any previous computation step.

> The "feature set" of Turing Completeness can be reduced to a loop, an array lookup, and an IO port.

This model is intrinsically single threaded, so the global branches/dependencies requirement is trivially satisfied. Generally, though, if you want to be able to distribute a computation, you have to pretty much never allow the result of a computation on any arbitrary compute thread to affect the next computation on any of the other threads. NOBODY would be able to train LLMs that are anything like the ones we see today if they were not willing to make that sacrifice.

Also, downstream from this are the hardware optimizations needed to even run these algorithms. While you _could_ train any of the current LLMs on large CPU clusters, a direct port would require perhaps 1000x more hardware, electricity, etc. than running it on GPUs or TPUs. Not only that, but if the networks being trained (+ some amount of training data) couldn't fit into fast GPU/TPU memory during training, and instead had to be swapped in and out of system memory or disk, that would also cause orders of magnitude of slowdown, even with GPUs/TPUs doing the training.

In other words, what we're seeing is a trend towards ever increasing coupling between the algorithms being run and the hardware they run on.

When I say that thinking in terms of Turing completeness is a distraction, I don't mean it's wrong. It's just irrelevant.
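For concreteness, here is roughly what that reduced "loop, array lookup, IO port" model looks like as code. This is only a minimal sketch; the transition-table encoding, symbol names, and step cap are made up for illustration:

    # Minimal Turing-machine-style interpreter: one loop, one table lookup per
    # step, and tape reads/writes standing in for the IO port.
    def run_turing_machine(table, tape, state="start", halt="halt", max_steps=10_000):
        tape = dict(enumerate(tape))   # sparse tape; missing cells read as blank "_"
        head = 0
        for _ in range(max_steps):     # the loop (cap is practical, not theoretical)
            if state == halt:
                break
            symbol = tape.get(head, "_")
            new_symbol, move, state = table[(state, symbol)]  # the "array lookup"
            tape[head] = new_symbol    # write back to the tape
            head += 1 if move == "R" else -1
        return [tape[i] for i in sorted(tape)]

    # Example machine: flip 0s and 1s until a blank is reached, then halt.
    flip = {
        ("start", "0"): ("1", "R", "start"),
        ("start", "1"): ("0", "R", "start"),
        ("start", "_"): ("_", "R", "halt"),
    }
    print(run_turing_machine(flip, list("0110")))  # -> ['1', '0', '0', '1', '_']

Note that every pass through the loop depends on the state and tape left by the previous pass, which is exactly the global dependency chain that makes this model intrinsically single threaded.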
vidarh a day ago
> NOBODY would be able to train LLMs that are anything like the ones we see today if they were not willing to make that sacrifice.

The point is that they haven't made that sacrifice: every LLM we have today is Turing complete if you put a loop around it that uses the context as a means to continue the state transitions. Turing completeness does not mean all, or most, or even any of your computations need to be carried out like in a UTM. It only requires the theoretical ability. They can take any set of shortcuts you want.
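Concretely, the construction is just a loop that feeds the model's output back into its context. This is a rough sketch: llm_step is a hypothetical placeholder for whatever completion call you actually use, and the halt marker is an arbitrary convention, not any real API:

    def llm_step(context: str) -> str:
        """Hypothetical stand-in for a single model call; plug in a real API here."""
        raise NotImplementedError

    def run_until_halt(prompt: str, halt_marker: str = "<HALT>", max_steps: int = 1000) -> str:
        # Each model call is one state transition; the growing context is the
        # "tape" that carries the state between transitions.
        context = prompt
        for _ in range(max_steps):      # the cap is practical, not part of the argument
            output = llm_step(context)
            context += output
            if halt_marker in output:   # the model signals when the computation is done
                break
        return context

In the theoretical construction the loop is unbounded, which is all Turing completeness asks for; nothing about how the model itself is trained or run on GPUs changes.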