Remix clone Hacker News

	▲	energy123 4 hours ago
		Isn't the Sora video model a ViT with spatiotemporal inputs (so they've found a way to compress that down), but at the same time LeCun wouldn't consider that a world model?