| ▲ | obblekk 5 hours ago | |||||||||||||
This is the first time in recent memory that software has had high variable costs so the surprise at these rules is understandable. In this case, a the difference in context cache hit rate between openclaw and antigravity. For example if openclaw starts every message with the current time hh:mm:ss at the top of the context window, followed by the full convo history, it would have a cache hit rate if ~0. Simply moving the updated time to each new message incrementally would increase hit rate to over 90%. Idk if openclaw does this but there’s many many optimizations like this. And worse, thrashing the cache has non linear effects on the server as more and more users’ cached contexts get evicted from cache due to high cardinality. The cost to serve difference could be >10x. Google is the furthest behind on coding agent adoption and has all the incentives to allow off policy use to grow demand. But it would probably be better to design their own optimized openclaw and serve that for free than let any unoptimized requests in. | ||||||||||||||
| ▲ | martinald 5 hours ago | parent [-] | |||||||||||||
It's a fair point, but I think people are thinking too much about 'cost' and 'subsidies' and just the fact that everyone is so compute stretched. While it's sort of the same thing, I think it's much more a symptom of not enough compute vs some 'dump cheap tokens' on the market strategy. One related thought I had was that given OpenAI is the only one _not_ doing this of the big3, it probably indicates they have a lot more spare compute. It doesn't make sense to me that given the absolutely brutal competition any of these companies would block use of 3rd party apps unless they had to. They clearly have enough cash, so I don't think it's about money - I think it's that an indicator that Google and Anthropic are really struggling with keeping up with demand. Given Anthropics reliability issues last week this does not surprise me. | ||||||||||||||
| ||||||||||||||