Remix.run Logo
jrk 4 hours ago

I think they weren’t asking “why can’t Gemini 3, the model, just do good transcription,” they were asking “why can’t Gemini, the API/app, recognize the task as something best solved not by a single generic model call, but by breaking it down into an initial subtask for a specialized ASR model followed by LLM cleanup, automatically, rather than me having to manually break down the task to achieve that result.”