tim333 7 days ago

There are similarities with that one. From their website:

>It is comprised of a spatiotemporal video tokenizer, an autoregressive dynamics model, and a simple and scalable latent action model.

My point is that more people can try different models and algorithms rather than having to stick with LLMs.