| ▲ | nl 2 hours ago | |
They are called "world models" because their internal representation of what they display is a "world" rather than a video frame or image. The model needs to "understand" geometry and physics to output a video. The fact that it makes errors doesn't mean this isn't significant. A machine learning model that understands how physical objects interact with each other is very useful. | ||
| ▲ | godelski an hour ago | parent | next [-] | |
Do they? I'm unconvinced. The tiger and girl video is the clearest example. Nothing about that seems world-representing. | ||
| ▲ | slashdave an hour ago | parent | prev | next [-] | |
> The model needs to "understand" geometry and physics to output a video. No, it doesn't. It merely needs to mimic. | ||
| ▲ | PunchyHamster an hour ago | parent | prev [-] | |
I think the reason is "those words look nice on promo material". It is absolutely built to trigger hype from the clueless. | ||