brookst 6 hours ago
My hypothesis is that people who have continuous sessions that keep the cache valid see the behavior you’re describing: at 95% cache hits (or thereabouts), the max plan goes a long way. But for people who go more than 5 minutes between prompts and get no cache hits, usage is eaten up quickly, especially when each prompt passes in hundreds of thousands of tokens of conversation history. I know my quota goes a lot further when I sit down and keep sessions active, and much less far when I’m distracted and let it sit for 10+ minutes between queries. It’s a guess. But with n=1 and possible confirmation bias noted, it’s what I’m seeing.
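To put rough numbers on that guess, here is a minimal sketch of the arithmetic. Everything in it is an assumption for illustration: the function name `billed_history_tokens` is made up, the 5-minute window comes from the comment above, and the 0.1 cost factor for cached reads is a placeholder rather than a quoted price. It also ignores cache-write costs and treats the history size as constant.

```python
def billed_history_tokens(history_tokens, gaps_minutes,
                          ttl_minutes=5.0, cached_factor=0.1):
    """Estimate billed-token-equivalents for re-sending history each prompt.

    history_tokens: size of the conversation history (held constant here).
    gaps_minutes:   idle time before each prompt, in minutes.
    ttl_minutes:    assumed cache lifetime (hypothetical default of 5).
    cached_factor:  assumed relative cost of a cached read (placeholder).
    """
    total = 0.0
    for gap in gaps_minutes:
        if gap <= ttl_minutes:
            # Cache still warm: history is billed at the reduced rate.
            total += history_tokens * cached_factor
        else:
            # Cache expired: the full history is billed again.
            total += history_tokens
    return total


# Twenty prompts against a 200k-token history, two usage patterns.
active = billed_history_tokens(200_000, [1] * 20)       # prompt every minute
distracted = billed_history_tokens(200_000, [10] * 20)  # 10-minute gaps

print(f"active:     {active:,.0f}")
print(f"distracted: {distracted:,.0f}")
print(f"ratio:      {distracted / active:.1f}x")
```

Under these made-up parameters the distracted pattern costs about as many times more as the inverse of `cached_factor`, which is the shape of the effect described above. The real multiplier depends on actual pricing and cache behavior, neither of which is confirmed here.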
nickstinemates 5 hours ago
I run dozens, maybe hundreds, of new sessions every day. I don't have long-lived sessions. 1 session = 1 task.
HauntingPin 5 hours ago
Why is it our job to micromanage all this when it used to work fine without? Something's clearly changed for the worse. Why are people insisting on pushing the responsibility onto paying users?