sireat 4 days ago
Basically it boils down to this: for most queries google/gemini-2.5-flash is the workhorse, fast, cheap, and good enough. Add in multimodality and 1M context and it is a real Swiss army knife. It is cheap and performant enough to run 100k queries. (That took a bit over a day and cost around 30 euros for a major document classification task.) Yes, in theory this could have been done with a fine-tuned BERT, or maybe even with some older methods, but Flash saved far too much time to pass up. There is another factor that may explain why Flash is #1 in most categories on OpenRouter: Flash has gotten reasonably decent at less common human languages. Most cheap models (including Flash Lite) and local models have mostly English-focused training.
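For scale, the figures in this comment work out to a rough per-query cost and average throughput. This is only a back-of-envelope sketch: the 24-hour duration is an assumption ("a bit over a day" rounded down), so the real average rate was somewhat lower.

```python
# Figures quoted in the comment above.
queries = 100_000
cost_eur = 30.0

# Assumption: "a bit over a day" rounded down to exactly 24 hours,
# so the throughput below is an upper bound on the real average.
hours = 24.0

cost_per_query = cost_eur / queries
queries_per_second = queries / (hours * 3600)

print(f"{cost_per_query:.4f} EUR per query")
print(f"{queries_per_second:.2f} queries per second on average")
```

In other words, a job of this size needs only about one request per second on average, not dozens in flight at once.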
karmakaze 4 days ago
This was my initial assessment as well. Also note:

> Grok I forgot about until it was too late.

I was surprised by how much I prefer Grok to the others. Even its persona is how I prefer it: detailed, without volunteering unwanted information or sycophancy. In general I'd use Grok-3, which is good enough for common uses, more than Grok-4. I suspect that Claude would be best only if I gave it a long, complex task with enough instructions up front, so it could grind away on it while I did something else instead of waiting on it.
vjerancrnjak 4 days ago
How do you run so many? I'm constantly exhausting the resources and can't even make 20 concurrent calls.
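One common way to approach this (not necessarily what the parent did) is to cap the number of in-flight requests and back off when the provider signals a rate limit. Below is a minimal sketch using only the Python standard library. `run_all`, `RateLimited`, and the `call` argument are hypothetical names introduced for illustration; `call` stands in for whatever async function actually sends one request, and `RateLimited` stands in for however your client reports an HTTP 429.

```python
import asyncio
import random


class RateLimited(Exception):
    """Hypothetical error standing in for a rate-limit response (e.g. HTTP 429)."""


async def run_all(items, call, max_concurrency=8, max_retries=5, base_delay=1.0):
    """Run call(item) for every item, with at most max_concurrency in flight.

    Retries on RateLimited using exponential backoff plus random jitter.
    Results are returned in the same order as items.
    """
    sem = asyncio.Semaphore(max_concurrency)

    async def one(item):
        # The semaphore slot is held while sleeping, so a rate-limited
        # worker also reduces the effective concurrency for a while.
        async with sem:
            for attempt in range(max_retries + 1):
                try:
                    return await call(item)
                except RateLimited:
                    if attempt == max_retries:
                        raise  # give up after the final retry
                    delay = base_delay * 2 ** attempt
                    await asyncio.sleep(delay + random.uniform(0, delay))

    return await asyncio.gather(*(one(item) for item in items))
```

Usage would look like `results = asyncio.run(run_all(docs, classify, max_concurrency=8))`, where `docs` and `classify` are your own data and request function. Starting with a low `max_concurrency` and raising it until rate-limit errors appear is usually safer than starting at 20.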