Remix.run Logo
wizzwizz4 10 months ago

The architecture of the model does place limits on how much computation can be performed per token generated, though. Combined with the window size, that's a hard bound on computational complexity that's significantly lower than a Turing machine – unless you do something clever with the program that drives the model.

vidarh 10 months ago | parent [-]

Hence the requirement for using the context for IO. A Turing machine requires two memory "slots" (the position of the read head, and the current state) + IO and a loop. That doesn't require much cleverness at all.