Bolwin | 3 hours ago
Yeah, that's a massive assumption they're making. I remember Musk revealed Grok was multiple trillion parameters, and I find it likely Opus is larger. I'm sure Anthropic is making money off the API, but I highly doubt it's 90% profit margins.
jychang | 2 hours ago | parent | next
> I find it likely Opus is larger.

Unlikely. Amazon Bedrock serves Opus at 120 tokens/sec. If you want to estimate the actual price to serve Opus, a good rough estimate is to take the highest price among DeepSeek, Qwen, Kimi, and GLM, and multiply it by 2-3. That would be a pretty close guess at the actual inference cost for Opus.

It's impossible for Opus to have something like 10x the active params of the Chinese models. My guess is around 50-100B active params, 800-1600B total params. I could be off by a factor of ~2, but I know I'm not off by a factor of 10.
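The heuristic above can be sketched in a few lines. All the per-token prices below are hypothetical placeholders, not real quotes; only the "max open-weights price x 2-3" rule comes from the comment.

```python
# Back-of-envelope sketch of the "max(open models) x 2-3" heuristic.
# Prices are made-up placeholders in USD per 1M output tokens.
open_model_prices = {
    "DeepSeek": 1.10,  # assumed, not a real quote
    "Qwen": 0.90,
    "Kimi": 1.20,
    "GLM": 0.80,
}

def estimate_opus_cost(prices, multiplier):
    """Rough serving-cost guess: highest open-weights price times 2-3."""
    return max(prices.values()) * multiplier

low = estimate_opus_cost(open_model_prices, 2)
high = estimate_opus_cost(open_model_prices, 3)
print(f"Estimated Opus serving cost: ${low:.2f}-${high:.2f} per 1M tokens")
```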
nbardy | an hour ago | parent | prev | next
You can estimate from tokens/second.

The trillions-of-parameters claim is about pretraining. It's most efficient in pretraining to train the biggest model possible: you get a sample-efficiency increase for each parameter increase. However, those models end up very sparse and highly distillable. And it's far too expensive and slow to serve models that size, so they are distilled down a lot.
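The tokens/sec estimate can be made concrete with a bandwidth-bound argument: at batch size 1 with no speculative decoding, each generated token has to stream every active parameter from memory, which puts an upper bound on active params. The hardware figures below (8-GPU node, ~3.35 TB/s HBM per H100-class GPU, 8-bit weights) are my assumptions, not anything from the thread; only the 120 tok/s figure is quoted upthread.

```python
# Hedged back-of-envelope: in the bandwidth-bound decode regime,
#   tokens/sec <= total_memory_bandwidth / (active_params * bytes_per_param)
# so observed decode speed implies an upper bound on active parameters.

def max_active_params(tokens_per_sec, bandwidth_bytes_per_sec, bytes_per_param):
    """Upper bound on active params implied by observed decode speed."""
    return bandwidth_bytes_per_sec / (tokens_per_sec * bytes_per_param)

# Assumed hardware: one 8-GPU node, ~3.35 TB/s HBM bandwidth per GPU
# (H100-class), weights stored at 1 byte/param (8-bit quantization).
bandwidth = 8 * 3.35e12          # aggregate bytes/sec, an assumption
params = max_active_params(120, bandwidth, 1.0)  # 120 tok/s from upthread
print(f"~{params / 1e9:.0f}B active params upper bound")
```

The real bound is much looser in practice (batching, tensor parallelism, and speculative decoding all raise effective throughput), which is why the observed 120 tok/s is compatible with an actual active-param count well below this ceiling.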
aurareturn | 2 hours ago | parent | prev
Anthropic's CEO said 50%+ margins in an interview. I'm guessing 50-60% right now.