bob1029 3 hours ago

I have a really strong suspicion that there is something different about OAI prepaid tokens in the API vs elsewhere. I've been able to get away with spending less than $150/month on average while many peers are hitting 10x that.

I am curious how many on HN have manually configured their copilot install with a custom OAI token for 5.4/5.5. In my experience, the performance difference over the built-in subscription models is immense. This setup tends to solve the problem so quickly and reliably that any desire to have it run while I'm asleep seems absolutely ridiculous. The performance is constant throughout the day and week.

I think what might be happening is that we are chasing the cost optimization rabbit a little bit too hard. Capability is a weird dimension to quantify. A weaker model is not weaker in a linear way. It's usually this incredibly tall brick wall of a discrete go/no-go. If the model can't do the task, it doesn't matter how cheap the tokens are. Something approaching the inverse is also largely true.

Focus on the capability (is this giving my customer what they want) instead of the cost, and you will likely find that the cost never reaches a threshold where you even begin to worry about it. Starting from a position of cost optimization tends to spiral into a dark place.
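To put rough numbers on the go/no-go point, here is a minimal sketch. It assumes each attempt succeeds independently with a fixed probability, so the expected spend per solved task is the per-attempt cost divided by that probability. The function name and the dollar figures are made up for illustration, not measurements of any real model.

```python
import math


def expected_cost_per_success(cost_per_attempt: float, p_success: float) -> float:
    """Expected spend to get one solved task, assuming independent retries.

    With a fixed per-attempt success probability p, the number of attempts
    until the first success is geometric with mean 1/p, so the expected
    cost is cost_per_attempt / p.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    if p_success == 0.0:
        # The brick wall: the model cannot do the task at any price.
        return math.inf
    return cost_per_attempt / p_success


# Hypothetical figures, for illustration only.
cheap = expected_cost_per_success(cost_per_attempt=0.10, p_success=0.02)
strong = expected_cost_per_success(cost_per_attempt=1.50, p_success=0.90)
print(f"cheap model:  ${cheap:.2f} per solved task")
print(f"strong model: ${strong:.2f} per solved task")
```

Under these made-up numbers the model that is 15x more expensive per attempt ends up cheaper per solved task, and at a success probability of zero no token price helps.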

throawayonthe 2 hours ago | parent [-]

> any desire to have it run while I'm asleep seems absolutely ridiculous.

could that be the difference from your peers? :p (real question b/c if you brought it up you're probably seeing others do it)

bob1029 2 hours ago | parent [-]

The point I'm trying to make is that a lot of people are resorting to the 24/7 Ralph loops because they're using weaker models that need an incredible number of attempts to make any progress. The Death Star has different game-theoretic implications. You probably don't need it to be lasering entire planets while you sleep, assuming the laser system actually works as advertised. I've never had a copilot run that took so long that I had to get up from my PC. Maybe 10 minutes. What the hell can run for 24 hours and still converge in a meaningful way?