Remix.run Logo
jeremyjh 2 hours ago

No, that simply is not the case. The whole point of deep learning - and the reason it has been successful in so many domains over the last 20 years - is that generalization does occur. Leela will kick your ass at chess whether she's seen the position before or not, even if her search depth is set at 1 ply.

In the case of LLMs, the compression ratio alone absolutely requires this.

IAmGraydon an hour ago | parent [-]

So what do you think is the reason it could do 30x8 and not 31x7?