taminka 20 hours ago

whisper is great, i wonder why youtube's auto-generated subs are still so bad? even the smallest whisper model is way better than google's solution. is it a licensing issue? harder to deploy at scale?

briansm 17 hours ago

I believe YouTube still uses 40 mel-scale vectors as feature data, while Whisper uses 80, which provides finer spectral detail but is naturally more computationally intensive to process (modern hardware allows for that).
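
To make the 40 vs 80 comparison concrete, here is a rough sketch of a mel filterbank in plain Python. The 16 kHz sample rate and 400-sample FFT window match what Whisper documents for its front end; the HTK-style mel formula and the helper names (`mel_filterbank`, `log_mel_frame`) are my own assumptions for illustration, not either system's actual code, and the 40-band figure for YouTube is only the guess above.

```python
import math

def hz_to_mel(hz):
    # HTK-style mel formula (an assumption; real front ends may use another variant)
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft=400, sample_rate=16000):
    """Return n_mels triangular filters, each a list of n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank

def log_mel_frame(power_spectrum, bank):
    """Project one frame's power spectrum onto the filterbank and take log10."""
    return [
        math.log10(max(sum(w * p for w, p in zip(row, power_spectrum)), 1e-10))
        for row in bank
    ]

if __name__ == "__main__":
    flat_spectrum = [1.0] * 201  # placeholder power spectrum for one frame
    for n_mels in (40, 80):
        frame = log_mel_frame(flat_spectrum, mel_filterbank(n_mels))
        print(n_mels, "bands ->", len(frame), "features per frame")
```

Doubling the band count doubles the feature vector per frame, so the same frequency range is split into narrower filters. That is the "finer spectral detail" part, and the extra cost shows up in everything downstream that consumes those frames.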

ec109685 11 hours ago

You’d think they’d use the better model at least for videos that have large view counts (they already do that when deciding compression optimizations).