jdeng 5 hours ago

Glad to see open source models are catching up and treating vision as a first-class citizen (a.k.a. native multimodal agentic models). GLM and Qwen models take a different approach, shipping a base model and a separate vision variant (glm-4.6 vs glm-4.6v).

I guess after Kimi K2.5, other vendors are going the same route?

Can't wait to see how this model performs on computer automation use cases like VITA AI Coworker.

https://www.vita-ai.net/