janalsncm 21 hours ago

It would be extremely interesting if we could use this kind of model surgery to tack on additional modalities. For example, adding vision to a text-only model.

Another very interesting thing would be modulating compute at the token level. The default is 0 loops, but maybe 1 loop is better, and 10 loops is better still.
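
To make the per-token idea concrete, here is a toy sketch in plain Python. Everything in it is made up for illustration: `toy_block` is a hypothetical stand-in for whatever layer(s) would be repeated, and `apply_with_loops` just assumes each token gets its own loop count from somewhere. It is not how any real model does this.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def toy_block(h: Vector) -> Vector:
    """Hypothetical stand-in for the repeated layer(s): a simple residual update."""
    return [x + 0.5 * (1.0 - x) for x in h]


def apply_with_loops(
    hidden: Sequence[Vector],
    loops: Sequence[int],
    block: Callable[[Vector], Vector] = toy_block,
) -> List[Vector]:
    """Run `block` on each token's hidden state `loops[i]` times.

    A count of 0 leaves the token untouched, which matches the
    "default is 0 loops" case.
    """
    if len(hidden) != len(loops):
        raise ValueError("need exactly one loop count per token")
    out: List[Vector] = []
    for h, n in zip(hidden, loops):
        if n < 0:
            raise ValueError("loop counts must be non-negative")
        state = list(h)  # copy so the caller's input is not mutated
        for _ in range(n):
            state = block(state)
        out.append(state)
    return out


# Three tokens with the same starting state but different compute budgets.
states = apply_with_loops([[0.0], [0.0], [0.0]], loops=[0, 1, 2])
```

The open question is where the loop counts come from. Here they are passed in by hand; a learned router or a confidence threshold would be the obvious things to try.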