	▲	santiagobasulto 3 hours ago
		Not at all, I had the same reaction the first time I read it. I think the key is that the "encoder" they're using is just a linear projection, which should be fast and memory-efficient. A single matmul instead of a full ViT encoder is probably a huge win.
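
To make the "single matmul" point concrete, here is a rough NumPy sketch of what a linear-projection "encoder" could look like: cut the image into patches, flatten each one, and multiply by one weight matrix. The patch size (16), embedding width (768), image size (224x224) and the names `patchify` and `linear_encoder` are my own assumptions for illustration, not taken from the work being discussed.

```python
import numpy as np


def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch * patch * C).
    """
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be divisible by patch")
    gh, gw = h // patch, w // patch
    # (gh, patch, gw, patch, C) -> (gh, gw, patch, patch, C)
    x = image.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(gh * gw, patch * patch * c)


def linear_encoder(image: np.ndarray, weight: np.ndarray, patch: int) -> np.ndarray:
    """Hypothetical 'encoder': one matrix multiply over the flattened patches."""
    return patchify(image, patch) @ weight


# Assumed sizes, chosen only for the example.
rng = np.random.default_rng(0)
patch, dim = 16, 768
image = rng.standard_normal((224, 224, 3)).astype(np.float32)
weight = (0.02 * rng.standard_normal((patch * patch * 3, dim))).astype(np.float32)

tokens = linear_encoder(image, weight, patch)
print(tokens.shape)  # (196, 768)
```

With these sizes the whole "encoder" is a single 768x768 weight matrix applied to 196 patches, with no attention layers or MLP blocks, which is where the speed and memory savings over a ViT stack would come from.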