MediaSquirrel 12 hours ago

re: Whisper v3 -- how is this possible? Whisper has a 30s context window. You have to chunk it.

sipjca 2 hours ago

Wondering the same. It can certainly run beyond 30 seconds, but I believe the output should degrade at some point.

Plus you could do actual batch inference instead. Or, if you must carry the context forward, you could still do it linearly, but the memory usage shouldn't just explode.
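
To make the chunk-and-batch idea concrete, here is a rough sketch, assuming 16 kHz mono audio held as a plain sequence of samples. The names `chunk_audio` and `batched` are made up for illustration and are not part of any Whisper API; the model call itself is left out.

```python
from typing import Iterator, List, Sequence

SAMPLE_RATE = 16_000                        # Whisper works on 16 kHz audio
CHUNK_SECONDS = 30                          # the 30 s window discussed above
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def chunk_audio(samples: Sequence[float],
                chunk_samples: int = CHUNK_SAMPLES) -> List[List[float]]:
    """Split audio into fixed-size windows, zero-padding the last one."""
    chunks = []
    for start in range(0, len(samples), chunk_samples):
        chunk = list(samples[start:start + chunk_samples])
        # Pad the final, shorter window with silence so all windows match.
        chunk.extend([0.0] * (chunk_samples - len(chunk)))
        chunks.append(chunk)
    return chunks


def batched(chunks: List[List[float]],
            batch_size: int) -> Iterator[List[List[float]]]:
    """Yield groups of at most batch_size windows (hypothetical helper)."""
    for i in range(0, len(chunks), batch_size):
        yield chunks[i:i + batch_size]
```

Each batch would then go through the model in a single forward pass, so the model only ever sees `batch_size` windows at a time, however long the recording is. This sketch does not carry context between windows; doing that would mean processing them in order instead of in parallel.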