wongarsu an hour ago

One thing is robotics. Both for training robotics AI, and to let robots test hypothetical actions before committing to them. I don't think world models are stable enough for either yet

The other is creating multi-modal models with a better understanding of our world. LLMs often fail at incredibly basic spatial reasoning ("someone left a package in front of your apartment, describe going there", or the "should I drive to the car wash or go there" question, etc.). World models excel at these kinds of things (in theory). They develop a great understanding of physical spaces and object interactions, and they can simulate fluids, rigid body physics, etc. You "just" have to get really good at making world models, then somehow marry them with an LLM in a way that ensures the LLM can benefit from the world model's training data. Nobody has managed to really do that yet

So lots of hopes for the future. Until then they get commercialized as video models, or ways to experience your favorite forest, or to have a really bad video game ... whatever can be sold on a short time horizon to finance the actual goals