stavros 6 hours ago

Everything bursty will use economic incentives to smooth the load. I'm not sure how they'd do that with workload scheduling orchestration when you have latency-sensitive loads and there are e.g. twice as many requests at midday as at midnight.

toomuchtodo 6 hours ago | parent [-]

You decouple the workloads from human interaction (i.e., when you submit the job to the queue vs. when it is scheduled to execute) so that, where possible, when they run is not a consideration. The economic incentives encourage solving this, and if it can’t be solved, they bucket customers into cohorts by willingness (or unwillingness) to pay for access during peak times.
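
A rough sketch of that decoupling, in Python. The off-peak window, the `pays_for_peak` flag, and the function names are all made up for illustration; they are not anything a real provider exposes. The idea is just that submit time and execution time are separate, and only customers who pay for peak access run immediately.

```python
from datetime import datetime

# Hypothetical off-peak window: 22:00 through 05:59 local time.
OFF_PEAK_START = 22
OFF_PEAK_END = 6


def is_off_peak(t: datetime) -> bool:
    """True if `t` falls inside the assumed off-peak window."""
    return t.hour >= OFF_PEAK_START or t.hour < OFF_PEAK_END


def next_run_time(submitted: datetime, pays_for_peak: bool) -> datetime:
    """Return the earliest time a job submitted at `submitted` may execute."""
    # Peak-paying customers, and anything submitted off-peak, run right away.
    if pays_for_peak or is_off_peak(submitted):
        return submitted
    # Everything else is deferred to the start of tonight's off-peak window.
    return submitted.replace(hour=OFF_PEAK_START, minute=0, second=0, microsecond=0)


noon = datetime(2024, 1, 1, 12, 0)
print(next_run_time(noon, pays_for_peak=False))  # deferred to 22:00 the same day
print(next_run_time(noon, pays_for_peak=True))   # runs at submit time
```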

stavros 6 hours ago | parent [-]

Sure, but if I ask the LLM a question, I'd like it to respond now, instead of tonight.

toomuchtodo 5 hours ago | parent [-]

Certainly, interactive workloads aren’t realistic for time shifting, but agentic coding likely is. Package everything up and ship it as a job, getting a bundle back asynchronously.

stavros 5 hours ago | parent [-]

I don't know, my agentic coding is pretty interactive. Maybe once the plan is done, sure. That would be interesting, though OpenAI already does this with batch workloads.