| ▲ | daemonologist 4 hours ago | |||||||
Even if it's larger, OpenRouter has DeepSeek v3.2 (685B/37B active) at $0.26/0.40 and Kimi K2.5 (1T/32B active) at $0.45/2.25 (mentioned in the post).
| ▲ | johndough 3 hours ago | parent [-] | |||||||
Opus 4.6 likely has on the order of 100B active parameters. OpenRouter lists the following throughput for Google Vertex:
For GLM 4.7, that makes 143 tokens/s * 32B active = 4576B parameters per second, and for Llama 3.3, we get 70 tokens/s * 70B = 4900B. The dense model coming out slightly ahead makes sense, since denser models are easier to optimize. Taking the smaller of the two figures as a lower bound, we get 4576B / 42 tokens/s ≈ 109B active parameters for Opus 4.6. (This assumes that all three models use the same number of bits per parameter and run on the same hardware.)
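For anyone who wants to redo the arithmetic with other numbers, here is a rough sketch. The function name is made up for illustration, the throughput values are read as tokens per second, and 42 is taken to be the Opus 4.6 figure, as the division above implies. It inherits the same assumption of equal bits per parameter and equal hardware, so treat the result as a ballpark and nothing more.

```python
def estimate_active_params_b(ref_tokens_per_s, ref_active_params_b, target_tokens_per_s):
    """Back-of-envelope estimate of a model's active parameters (in billions).

    Assumes the serving hardware processes a fixed number of parameters per
    second regardless of the model (hypothetical helper, not an OpenRouter API).
    """
    # Parameters "touched" per second by the reference model.
    params_per_s_b = ref_tokens_per_s * ref_active_params_b
    # A model generating fewer tokens per second must touch more parameters per token.
    return params_per_s_b / target_tokens_per_s


# Figures quoted in the comment above: (tokens/s, active parameters in billions).
glm_rate, glm_active = 143, 32      # GLM 4.7
llama_rate, llama_active = 70, 70   # Llama 3.3
opus_rate = 42                      # Opus 4.6 (assumed from the division above)

from_glm = estimate_active_params_b(glm_rate, glm_active, opus_rate)
from_llama = estimate_active_params_b(llama_rate, llama_active, opus_rate)

print(f"via GLM 4.7:   ~{from_glm:.0f}B active")
print(f"via Llama 3.3: ~{from_llama:.0f}B active")
```

The two reference models bracket the estimate at roughly 109B to 117B, which is where the "on the order of 100B" figure comes from.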