Remix clone Hacker News

	▲	toddmorey a day ago
		I’m actually amazed at the output since GLM doesn’t have eyes. If GLM 5.2 costs 1/5 as much, it seems like it could be set up to reach out to a multimodal model for vision tasks when required. That would bring it closer to parity while probably still being significantly cheaper.
	▲	horsawlarway a day ago
		I'm also very impressed by the output given the lack of image support. They picked a task that heavily favors a model that can do multi-modal with images, and GLM still came within striking distance. What I'm hearing from this article is that the next generation of open models, with better multi-modal support, will basically be a no-brainer for adoption. Seems like a HUGE win for Z.ai and open models in general here.
	▲	killingtime74 13 hours ago
		Yes, it could just make one call to a multimodal LLM to describe the scene.
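
A minimal sketch of the delegation idea discussed in this thread, assuming a text-only model that hands image inputs to a separate multimodal model exactly once. The names `Task`, `solve`, `text_model`, and `describe_image` are hypothetical stand-ins, not any vendor's actual API; real clients would be wrapped to fit these call shapes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    prompt: str
    image: Optional[bytes] = None  # raw image bytes, if the task includes one


def solve(
    task: Task,
    text_model: Callable[[str], str],        # hypothetical: cheaper text-only model
    describe_image: Callable[[bytes], str],  # hypothetical: multimodal model
) -> str:
    """Answer a task with the text-only model, delegating vision only when needed."""
    if task.image is None:
        # No image attached: the text-only model handles the task by itself.
        return text_model(task.prompt)
    # Exactly one call to the multimodal model, to turn the image into text.
    description = describe_image(task.image)
    # The text-only model then works from the description plus the original prompt.
    return text_model(f"Scene description: {description}\n\n{task.prompt}")
```

Under these assumptions the multimodal model is only billed for tasks that actually carry an image, which is where the cost advantage mentioned above would come from. Whether a one-shot description is enough for a given vision task is a separate question.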