hydra-f 5 hours ago
Vision has become totally underappreciated, even though I believe it brings important advantages to a model. Also, a big caveat in using Qwen models has always been their speech patterns. I do wonder how Google made the Gemma lineup so good at this. Let's hope Alibaba continues to open source its models.
jwr 5 hours ago
Agreed. Incidentally, in my testing, Qwen models (qwen3.6-35b-a3b and earlier 3.5) are WAY better with vision than gemma4-26b-a4b. I would normally want to stick with gemma4 only (I use it for spam filtering), but it just doesn't cut it for vision work, and Qwen models do.