▲ | eugene3306 5 days ago | |
What's the point of comparing token prices, especially for thinking models? I was just testing the new Qwen3-thinking model and ran the same prompt five times. The costs I got, sorted: 0.0143, 0.0288, 0.0321, 0.0389, 0.048. And that's for a single model. Also, in my experience, sonnet-4 ends up cheaper than gemini-2.5-pro, despite its per-token price being higher.
▲ | eugene3306 5 days ago | parent | |
I think the proper way to estimate cost is to measure the cost of an entire test run, like aider's leaderboard does.
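A rough sketch of the arithmetic behind this, in Python. The five costs are the ones quoted above (the currency is not stated there). Everything in the second half is an assumption for illustration only: the `run_cost` helper, the token counts, and the per-million-token prices are made up and do not correspond to any real model's pricing.

```python
from statistics import mean, median

# Per-run costs for five runs of the same prompt, as quoted in the comment above.
costs = [0.0143, 0.0288, 0.0321, 0.0389, 0.048]

# How far apart the cheapest and most expensive run are.
spread = max(costs) / min(costs)
print(f"mean={mean(costs):.4f} median={median(costs):.4f} max/min={spread:.2f}x")


def run_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Cost of one run from token counts and per-million-token prices."""
    return (input_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1_000_000


# HYPOTHETICAL numbers, not real pricing or real token counts.
# Model A: higher per-token price, but a short answer.
model_a = run_cost(2_000, 1_500, price_in_per_m=3.0, price_out_per_m=15.0)
# Model B: lower per-token price, but a long chain of thinking tokens.
model_b = run_cost(2_000, 6_000, price_in_per_m=1.25, price_out_per_m=10.0)

print(f"model A run: {model_a:.4f}, model B run: {model_b:.4f}")
```

With these made-up inputs the model with the higher token price is the cheaper one per run, because output length dominates the bill. Summing `run_cost` over every task in a benchmark gives the kind of whole-run figure suggested above.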