rao-v 6 hours ago

I’d love to believe this is real, but I’m pretty sure you will lose performance on a “fair” mix of tasks, even after fine-tuning. I know multiple teams have explored recurrent layers (great for limited VRAM), but I don’t think that approach has ever been found to be optimal.