data-ottawa 6 hours ago
With GPT-5, did you try setting the reasoning level to "minimal"? I tried using it for a very small, quick summarization task that needed low latency, and any level above minimal took several seconds to return a response. Using minimal brought that down significantly. Weirdly, GPT-5's reasoning levels don't map to the OpenAI API's reasoning-effort levels.
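For reference, a minimal sketch of what the request looks like with the effort dialed down. The parameter shape follows OpenAI's Responses API (`reasoning.effort`); the helper function and prompt here are just illustrative, not from the thread:

```python
def build_request(prompt: str, effort: str = "minimal") -> dict:
    """Build a Responses API payload with a given reasoning effort.

    "minimal" is the lowest effort setting exposed for GPT-5 in the
    API; higher settings ("low", "medium", "high") spend more time
    on internal reasoning before producing output tokens.
    """
    return {
        "model": "gpt-5",
        "input": prompt,
        "reasoning": {"effort": effort},
    }

payload = build_request("Summarize this in one sentence: ...")
```

This dict would be passed to something like `client.responses.create(**payload)` with the official SDK; only the payload construction is shown so the latency-relevant knob is visible.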
barrell 5 hours ago | parent
Reasoning was set to minimal and low (and I think I tried medium at some point). I don't believe the timeouts were due to the reasoning taking too long, although I never streamed the results. I think the model just fails often: it stops producing tokens, and eventually the request times out.
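Streaming would make this failure mode observable: instead of one long request that eventually times out, you can watch the token stream and bail as soon as it stalls. A generic sketch of a per-item stall detector that wraps any streaming iterator (the function name and timeout value are illustrative):

```python
import queue
import threading

def iter_with_stall_timeout(iterable, timeout: float):
    """Yield items from `iterable`, raising TimeoutError if the
    producer goes silent for more than `timeout` seconds between
    items. Useful for detecting a model that stops emitting tokens
    mid-response rather than waiting for the whole request to time out.
    """
    q: queue.Queue = queue.Queue()
    done = object()  # sentinel marking normal end of stream

    def producer():
        for item in iterable:
            q.put(item)
        q.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        try:
            item = q.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("stream stalled: no item within timeout")
        if item is done:
            return
        yield item
```

Wrapping the SDK's streaming response in this would distinguish "reasoning is slow" (first token is late, then tokens flow) from "the model stopped producing" (tokens simply stop arriving).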