How do other local models perform on this task?
We've tried Qwen3, Llama4, and gemma3, but gpt-oss has been the best-performing model so far.