| ▲ | andy99 a day ago | |
Gemini Flash Lite is $0.10 per million input tokens; Claude Haiku is $1 per million. Input cost obviously dominates here if it's just a classifier. Training data can easily top 10 trillion tokens: an earlier Kimi K2 was trained on 15T, and even HF SmolLM 3B was trained on 11T. So if I've calculated right, that's $100k-$1M per trillion tokens, or $1M-$10M for a full ~10T-token dataset. That's way more than I expected, though there is probably also some discount at that volume :) | ||
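For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes the two list prices quoted above, counts input tokens only, and ignores output tokens and any volume discount. The names `PRICE_PER_MILLION_USD`, `classification_cost`, and `cost_table` are made up for illustration.

```python
# Back-of-the-envelope cost of running a classifier over a training corpus.
# Assumption: prices are the per-million input-token figures quoted in the
# comment; output tokens and volume discounts are ignored.
PRICE_PER_MILLION_USD = {
    "Gemini Flash Lite": 0.10,
    "Claude Haiku": 1.00,
}


def classification_cost(total_tokens: float, usd_per_million: float) -> float:
    """Return the input-token cost in USD for processing total_tokens."""
    return total_tokens / 1_000_000 * usd_per_million


def cost_table(total_tokens: float) -> dict:
    """Return the cost in USD per model for a corpus of total_tokens."""
    return {
        model: classification_cost(total_tokens, price)
        for model, price in PRICE_PER_MILLION_USD.items()
    }


if __name__ == "__main__":
    # Corpus sizes: 1T for the per-trillion rate, 10T and 15T as full datasets.
    for label, tokens in [("1T", 1e12), ("10T", 1e13), ("15T", 1.5e13)]:
        for model, usd in cost_table(tokens).items():
            print(f"{label:>4} tokens, {model}: ${usd:,.0f}")
```

At these prices a 15T-token corpus like the Kimi K2 one would land between $1.5M and $15M, slightly above the 10T range.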