taminka 20 hours ago
Whisper is great. I wonder why YouTube's auto-generated subs are still so bad? Even the smallest Whisper model is way better than Google's solution. Is it a licensing issue? Is it harder to deploy at scale?
briansm 17 hours ago
I believe YouTube still uses 40 mel-scale vectors as feature data, while Whisper uses 80 (which provides finer spectral detail but is naturally more computationally intensive to process, though modern hardware allows for that).
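To make the 40 vs 80 comparison concrete, here is a rough sketch of a triangular mel filterbank in plain Python. It is only an illustration, not YouTube's or Whisper's actual front end: the function names are made up for this example, the mel formula is the common 2595 * log10(1 + f/700) variant, and the 16 kHz sample rate and 400-point FFT are assumed values.

```python
import math

def hz_to_mel(hz):
    # Common "HTK-style" mel formula; other variants exist.
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(n_mels, sample_rate=16000):
    """Return n_mels + 2 edge frequencies (Hz), evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    return [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]

def mel_filterbank(n_mels, n_fft=400, sample_rate=16000):
    """Triangular filters: n_mels rows, each with n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    edges = mel_band_edges(n_mels, sample_rate)
    bank = []
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        # Weight rises linearly from lo to mid, falls from mid to hi, zero elsewhere.
        row = [max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
               for f in bin_hz]
        bank.append(row)
    return bank

def apply_filterbank(power_spectrum, bank):
    """Collapse one frame's power spectrum into one energy value per mel band."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]

if __name__ == "__main__":
    for n_mels in (40, 80):
        edges = mel_band_edges(n_mels)
        bank = mel_filterbank(n_mels)
        top_band_width = edges[-1] - edges[-3]  # Hz covered by the highest band
        print(n_mels, "bands:", len(bank), "values per frame,",
              "top band spans about", round(top_band_width), "Hz")
```

With 80 bands each filter covers a narrower slice of the spectrum, which is the "finer spectral detail" part. The cost shows up as twice as many filter rows to apply per frame and twice as many input features for the model to consume.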
ec109685 11 hours ago
You’d think they’d use the better model at least for videos that have large view counts (they already do that when deciding compression optimizations).