| ▲ | thraxil 2 days ago | |
Thanks. Yeah, for now we're moving to 3.1 flash lite as that's the new cheapest at $.25/1M and is also still "good enough". 2.5 flash is more expensive at $.30/1M (looks like Deep Infra charges the same as GCP/VertexAI for it). I might check them out for Gemma though. We benchmarked Gemma2 when that came out and it wasn't remotely usable for us largely because the context window was way too small. It looks like 3 or 4 might be worth evaluating though. | ||
| ▲ | nl a day ago | |
Xiaomi's mimo-v2-flash is great if you care about speed and performance - it's 1/10 the price of Gemini 3.1 Flash Lite and faster (on OpenRouter). GCP does serve other non-Google models, but I'm not sure what they have other than Anthropic models. I don't think Haiku is a great model though. | ||