nextaccountic 3 hours ago
Do you pay for the full context on every prompt? What happened to the idea of caching the context server-side?
davesque 3 hours ago
You don't. Most of the time (after the first prompt following a compaction or context clear) the context prefix is cached, and you pay something like 10% of the normal price for cached tokens. But the cost of each prompt still grows roughly linearly with the context, so your total cost is the area under a line with positive slope. That means it increases quadratically with context length.
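To make the "area under a line" point concrete, here is a rough sketch of the arithmetic. Everything in it is an assumption for illustration: the function name `session_cost`, a fixed number of new tokens per turn, a single flat input price, and a 10% rate for cached tokens (the "something like 10%" figure above). It ignores output tokens, cache writes, cache expiry, and compactions.

```python
def session_cost(turns, tokens_per_turn, price_per_token, cached_discount=0.1):
    """Total input cost of a session under a simplified caching model.

    Assumptions (illustrative, not any provider's actual billing):
    - every turn appends `tokens_per_turn` new tokens at full price
    - everything sent on earlier turns is a cached prefix, billed at
      `cached_discount` times the full price
    """
    total = 0.0
    cached = 0  # tokens already in the cached prefix
    for _ in range(turns):
        total += price_per_token * (tokens_per_turn + cached_discount * cached)
        cached += tokens_per_turn  # this turn's tokens join the prefix
    return total


def session_cost_closed_form(turns, tokens_per_turn, price_per_token, cached_discount=0.1):
    """Same model summed analytically: a linear term plus a quadratic term."""
    n, t = turns, tokens_per_turn
    return price_per_token * (n * t + cached_discount * t * n * (n - 1) / 2)


if __name__ == "__main__":
    # Doubling the number of turns more than doubles the total cost.
    for n in (10, 20, 40, 80):
        print(n, session_cost(n, 1000, 1.0))
```

The cached term sums to `discount * t * n * (n - 1) / 2`, which is where the quadratic comes from. Caching shrinks the coefficient by about 10x but does not change the shape of the curve.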