data-ottawa 6 hours ago

With GPT-5, did you try setting the reasoning level to "minimal"?

I used it for a very small, quick summarization task that needed low latency, and any level above minimal took several seconds to return a response. Switching to minimal brought that down significantly.

Weirdly, GPT-5's reasoning levels don't map to the OpenAI API's reasoning effort levels.
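
For anyone wanting to try the same thing, here is a rough sketch of what the request body might look like. It only builds the payload and does not call the API. The field layout (`reasoning` with an `effort` key), the model name `gpt-5`, and the set of accepted values are my assumptions about the OpenAI Responses API, so check them against the current docs; `build_summary_request` is a made-up helper name.

```python
def build_summary_request(text: str, effort: str = "minimal") -> dict:
    """Build a request body for a short summarization call.

    Assumption: the endpoint accepts {"reasoning": {"effort": ...}} and
    these four effort values. Verify against the official API reference.
    """
    allowed = {"minimal", "low", "medium", "high"}  # assumed value set
    if effort not in allowed:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    return {
        "model": "gpt-5",  # assumed model identifier
        "input": f"Summarize in one sentence:\n\n{text}",
        "reasoning": {"effort": effort},
    }
```

Keeping the effort as a parameter makes it easy to time the same prompt at each level and compare latency.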

barrell 5 hours ago | parent [-]

Reasoning was set to minimal and low (and I think I tried medium at some point). I do not believe the timeouts were due to the reasoning taking too long, although I never streamed the results. I think the model just fails often: it stops producing tokens, and eventually the request times out.
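
One way to tell "slow reasoning" apart from "stopped producing tokens" would be to stream and apply an idle timeout between chunks instead of a single timeout on the whole request. Below is a minimal sketch of that idea, not anything I actually ran against the API. It wraps any blocking iterator of chunks; `iter_with_idle_timeout`, `StreamStalled`, and `fake_stream` are hypothetical names, and `fake_stream` just stands in for a real streaming response.

```python
import queue
import threading
import time
from typing import Iterable, Iterator


class StreamStalled(Exception):
    """Raised when no chunk arrives within the idle timeout."""


def iter_with_idle_timeout(chunks: Iterable[str], idle_timeout: float) -> Iterator[str]:
    """Yield chunks, raising StreamStalled if the gap between two chunks
    (or before the first one) exceeds idle_timeout seconds."""
    q: queue.Queue = queue.Queue()

    def pump() -> None:
        # Read the blocking source on a background thread and forward
        # each chunk, then signal either an error or normal completion.
        try:
            for chunk in chunks:
                q.put(("chunk", chunk))
        except Exception as exc:
            q.put(("error", exc))
        else:
            q.put(("done", None))

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            kind, value = q.get(timeout=idle_timeout)
        except queue.Empty:
            raise StreamStalled(f"no chunk for {idle_timeout}s") from None
        if kind == "done":
            return
        if kind == "error":
            raise value
        yield value


def fake_stream(delays: list) -> Iterator[str]:
    """Stand-in for a streaming response: sleeps, then emits a token."""
    for i, delay in enumerate(delays):
        time.sleep(delay)
        yield f"tok{i}"
```

With something like this, a request that reasons for a while but keeps emitting chunks would pass, while one that goes silent mid-response would raise `StreamStalled` after the idle window, and the partial output received so far would show where it stopped.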