ks2048 7 hours ago

Does anyone benchmark these models for speech-to-text using traditional word error rates? It seems audio-input Gemini is a lot cheaper than Google Speech-to-Text.

simonw 7 hours ago

Here's one: https://voicewriter.io/speech-recognition-leaderboard

"Real-World Speech-to-text API Leaderboard" - it includes scores for Gemini 2.5 Pro and Flash.
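
For anyone who wants to run this kind of comparison on their own audio, here is a rough sketch of how a traditional word error rate is usually computed: word-level edit distance (substitutions, insertions, deletions) divided by the number of words in the reference transcript. The function name and the lowercase-and-split normalization are my own assumptions, not what the leaderboard above does; real benchmarks typically apply heavier normalization (punctuation, numerals, contractions) before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Hypothetical helper for illustration; normalization here is only
    lowercasing and whitespace splitting.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Compare a model transcript against a human reference transcript.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, so it is an error rate rather than a bounded accuracy score.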