Remix clone Hacker News

	▲	charcircuit 2 days ago
		Visual reasoning models. Having a computer that can understand what is happening in the real world is very useful.
	▲	ACCount37 2 days ago
		Those are LLMs with an extra modality bolted onto them. Which is good - that it works this well speaks to the generality of autoregressive transformers, and the "reasoning over image data" progress with things like Qwen3-VL is very impressive. It's a good capability to have. But it's not a separate thing from the LLM breakthrough at all. Even the more specialized real-time robotics AIs often have a bag of transformers backed by an actual LLM.