▲ | vessenes 9 hours ago
Roughly 1/10 the cost of Opus 4.1 and 1/2 the cost of Sonnet 4 on a per-token inference basis. Impressive. I'd love to see a fast (Groq-style) version of this served. I wonder if the architecture is amenable.
▲ | aitchnyu 6 hours ago | parent | next [-]
Isn't it a 3x rate difference? $0.7 for Qwen3-VL vs. $3 for Sonnet 4?
▲ | petesergeant 8 hours ago | parent | prev [-] | |||||||
Cerebras are hosting other Qwen models via OpenRouter, so probably.