Remix.run Logo
CamperBob2 2 days ago

We’ve known for well over a decade that you cannot cram real-world spatial dynamics into those models. It is a clear impedance mismatch.

Then again, not much that we "knew" a decade ago is still relevant today. Of course transformer networks have proven capable of representing spatial intelligence. How could they work with 2D images, if not?