manquer | 4 days ago
I believe OP means scaling inference, not training. Smaller startups like Cursor or Windsurf are not competing on foundation model development, so whether new models are generationally better is not relevant to them. Cursor is competing with Claude Code, and both use Claude Sonnet. Even if Cursor were running an on-par model on their own GPUs, their inference costs would not be as cheap as Anthropic's, simply because they would not be operating at the same scale. Larger data centers mean better deals and more knowledge about running inference efficiently, because the big labs are also doing much larger training runs.