WarmWash 6 hours ago

The actual breakthrough with Genie is being able to turn around and look back, and seeing the same scene that was there before. A few other labs have similar world simulators, but they all struggle badly with keeping coherence of things not in view. Hence why they always walk forwards and never look around.

abraxas 3 hours ago | parent | next [-]

What about Fei Fei Li's lab? I think they are generating true 3D worlds rather than frames of a video?

Although that probably precludes her from having animations in those worlds...

nozbufferHere 5 hours ago | parent | prev | next [-]

Still amazed it took ML people so long to realize they needed an explicit representation to cache stuff.

Legend2440 5 hours ago | parent | next [-]

Genie does not use an explicit representation:

>Genie 3’s consistency is an emergent capability. Other methods such as NeRFs and Gaussian Splatting also allow consistent navigable 3D environments, but depend on the provision of an explicit 3D representation. By contrast, worlds generated by Genie 3 are far more dynamic and rich because they’re created frame by frame based on the world description and actions by the user.

emmettm 2 hours ago | parent | prev [-]

The representation is learned. Also, see Sutton's "Bitter Lesson" essay.

6 hours ago | parent | prev | next [-]
[deleted]
sfn42 6 hours ago | parent | prev [-]

And what if I go somewhere then go back there a week later?

jsheard 6 hours ago | parent [-]

Best they can do is 60 seconds, for now at least.

autonomousErwin 5 hours ago | parent [-]

Makes you wonder what the TTL caching for our universe is.

dabbz 4 hours ago | parent [-]

Whatever the speed of light is, I would imagine.