bryanlarsen 9 hours ago:
It works better, until you run out of tokens. Running out of tokens never used to happen to me, but this month it has started happening regularly. Maybe I could avoid it by turning off 1M tokens and max effort, but that's a cure worse than the disease IMO.
cube2222 6 hours ago (reply):
I would venture a guess that people have the wrong intuition about long-context pricing and are complaining because of that. Yes, the per-token price stays the same even with a large context. But that still means that in a 400k-context conversation you're spending 4x more cache-read tokens on each turn than you would in a 100k-context conversation.
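The arithmetic behind that 4x can be sketched in a few lines. The price constant below is hypothetical (purely for illustration, not any provider's actual rate); the point is only that per-turn cost scales linearly with context size, so a flat per-token price still means a 400k context costs 4x per turn what a 100k context does.

```python
# Sketch: why long-context conversations cost more per turn even at a
# flat per-token rate -- the whole context is re-read every turn.

CACHE_READ_PRICE_PER_MTOK = 0.30  # hypothetical $/million cache-read tokens


def turn_cost(context_tokens: int) -> float:
    """Cache-read cost of one conversation turn at the flat rate."""
    return context_tokens / 1_000_000 * CACHE_READ_PRICE_PER_MTOK


small = turn_cost(100_000)
large = turn_cost(400_000)
print(f"100k-context turn: ${small:.3f}")              # $0.030
print(f"400k-context turn: ${large:.3f} ({large / small:.0f}x)")  # $0.120 (4x)
```

Same per-token price in both cases; the 4x comes entirely from re-reading four times as many tokens on every turn.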