Remix clone Hacker News

	▲	SoftTalker 5 hours ago
		LLMs are trained on text. Why would we expect them to understand a visual and tactile 3D world?
	▲	azinman2 5 hours ago
		Because they’re also multimodal VLMs.