| diggan 6 days ago |
| > There are like 10 models that are smaller and faster and outperform both of them. |
| As someone who currently relies on Whisper for some things, which models are those exactly? I still haven't found anything as accurate as Whisper (large). Are those models just faster, or are they also as accurate or more accurate? |
|
| artemisart 6 days ago |
| Nvidia Parakeet and Canary are better and faster. Here is a leaderboard: https://huggingface.co/spaces/hf-audio/open_asr_leaderboard |
| |
| diggan 6 days ago |
| > Nvidia parakeet and canary are better and faster |
| Is that based on your own experience using those and also Whisper, comparing them side-by-side? Or is that based just on those benchmark results? |
| artemisart 6 days ago |
| Yes for Parakeet, but for Canary I'm only comparing benchmark results. Whisper also hallucinates severely on silence and noise, and WhisperX helps a lot: it adds voice activity detection (VAD), i.e. a model that detects when someone is speaking, to filter the input before running Whisper. https://github.com/m-bain/whisperX |
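To make "filter the input before running Whisper" concrete, here is a minimal sketch of the idea. It is not WhisperX's code and not a trained VAD model: it swaps in a plain RMS energy threshold so it stays stdlib-only. The names `frame_rms` and `speech_segments`, and the `frame_ms` and `threshold` defaults, are made up for illustration.

```python
import math


def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames and return each frame's RMS energy."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def speech_segments(samples, sample_rate, frame_ms=30, threshold=0.02):
    """Return (start_sec, end_sec) spans whose frame energy reaches the threshold.

    Assumes samples are floats in [-1.0, 1.0]. The threshold is an arbitrary
    illustrative value, not something taken from WhisperX.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = frame_rms(samples, frame_len)
    segments = []
    seg_start = None  # index of the first frame of the currently open segment
    for i, energy in enumerate(energies):
        if energy >= threshold and seg_start is None:
            seg_start = i
        elif energy < threshold and seg_start is not None:
            segments.append((seg_start * frame_len / sample_rate,
                             i * frame_len / sample_rate))
            seg_start = None
    if seg_start is not None:
        # Audio ended while a segment was still open.
        segments.append((seg_start * frame_len / sample_rate,
                         len(energies) * frame_len / sample_rate))
    return segments
```

Only the returned spans would be handed to the transcriber, so silent stretches never reach the model. An energy threshold also fires on loud noise, which is why a real pipeline would use a trained VAD model instead.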
| |
| wahnfrieden 6 days ago |
| Parakeet isn't more accurate than Whisper large |
|