ks2048 7 hours ago
Does anyone benchmark these models for speech-to-text using traditional word error rate (WER)? Audio-input Gemini seems to be a lot cheaper than Google Speech-to-Text.
simonw 7 hours ago
Here's one: https://voicewriter.io/speech-recognition-leaderboard "Real-World Speech-to-text API Leaderboard" - it includes scores for Gemini 2.5 Pro and Flash.
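
For anyone unfamiliar with the metric mentioned above: word error rate is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the model's output, divided by the number of words in the reference. Below is a minimal sketch, not how the linked leaderboard computes its scores. The function name `word_error_rate` is made up for illustration, and the normalization (lowercasing and whitespace splitting only) is an assumption; real benchmarks usually also normalize punctuation and numbers before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Assumption: text is normalized only by lowercasing and splitting
    on whitespace. Punctuation is left attached to words.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, since the denominator is the reference length only.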