twtw99 7 hours ago

This is great, but I guess they are feeling the heat from Codex resetting limits quite a bit over the last month.

stavros 7 hours ago | parent | next [-]

I think they're feeling the heat from growing too quickly so they want to incentivize people to spread the load more evenly.

toomuchtodo 6 hours ago | parent [-]

Very much like electric utility time-of-day pricing, using economic incentives to shift demand to trough periods.

Perhaps an opportunity for them to improve workload scheduling orchestration, like submitting a job to a distributed computing cluster queue, to smooth demand and maximize utilization.

stavros 6 hours ago | parent [-]

Everything bursty will use economic incentives to smooth the load. I'm not sure how they'd do that with workload scheduling orchestration when you have latency-sensitive loads and there are e.g. twice as many requests at midday as at midnight.

toomuchtodo 6 hours ago | parent [-]

You decouple the workloads from human interaction where possible (i.e., when you submit the job to the queue vs. when it is scheduled to execute), so that when they run is not a consideration. The economic incentives encourage solving this, and if it can’t be solved, they bucket customers into cohorts by willingness (or unwillingness) to pay for access during peak times.
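
A minimal sketch of that decoupling, to make the submit-time vs. run-time split concrete. Everything here is hypothetical: the `DeferredQueue` class, the `deferrable` flag, and the 09:00-17:59 peak window are assumptions for illustration, not anything a provider has published. Hours are plain integers counted from an arbitrary start, so 30 means 06:00 on the second day.

```python
import heapq
from dataclasses import dataclass, field

# Assumed peak window (09:00-17:59); a real provider would derive this from load data.
PEAK_HOURS = range(9, 18)


def next_off_peak_hour(hour: int) -> int:
    """Return the first absolute hour >= `hour` whose hour-of-day is off-peak."""
    for offset in range(24):
        if (hour + offset) % 24 not in PEAK_HOURS:
            return hour + offset
    raise ValueError("peak window covers the whole day")


@dataclass(order=True)
class Job:
    run_at: int                       # absolute hour at which the job may start
    name: str = field(compare=False)  # excluded from ordering


class DeferredQueue:
    """Jobs are ordered by scheduled run time, not by submission time."""

    def __init__(self) -> None:
        self._heap: list[Job] = []

    def submit(self, name: str, submitted_hour: int, deferrable: bool) -> int:
        # Deferrable jobs are pushed to the next off-peak hour;
        # latency-sensitive jobs are eligible to run immediately.
        run_at = next_off_peak_hour(submitted_hour) if deferrable else submitted_hour
        heapq.heappush(self._heap, Job(run_at, name))
        return run_at

    def due(self, now_hour: int) -> list[str]:
        """Pop and return the names of all jobs whose run time has arrived."""
        ready = []
        while self._heap and self._heap[0].run_at <= now_hour:
            ready.append(heapq.heappop(self._heap).name)
        return ready
```

Submitting an interactive request and a deferrable one at noon, then polling `due()` each hour, releases the interactive one right away and holds the other until the peak window ends. The pricing side would hang off the same `deferrable` flag: whoever insists on running at peak pays the peak rate.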

stavros 6 hours ago | parent [-]

Sure, but if I ask the LLM a question, I'd like it to respond now, instead of tonight.

toomuchtodo 6 hours ago | parent [-]

Certainly, interactive workloads aren’t realistic for time shifting, but agentic coding likely is. Package everything up and ship it as a job, getting a bundle back asynchronously.

stavros 5 hours ago | parent [-]

I don't know, my agentic coding is pretty interactive. Maybe once the plan is done, sure. That would be interesting, though OpenAI already does this with batch workloads.

Analemma_ 7 hours ago | parent | prev [-]

The insanely competitive market for LLMs is great for us, but if I were one of the investors in these companies it wouldn't exactly fill me with confidence that my $500 billion spent on datacenters and Nvidia cards is going to get repaid ten times over like they're claiming. I'm still getting very strong "this is a commodity; margins will be driven inexorably to zero" vibes from these products.