Jackson__ 2 hours ago
So they spent all of their R&D on copying DeepSeek, leaving none for the single novel feature they added: vision. To quote the HF page:
> Behind vision-first models in multimodal tasks: Mistral Large 3 can lag behind models optimized for vision tasks and use cases.
Ey7NFZ3P0nzAe an hour ago
Well, behind "models", not "language models". Of course models made purely for image tasks will completely wipe it out. Vision language models are useful for their generalist capabilities.