Do we know if this is better than Nvidia Parakeet V3? That has been my go-to model locally and it's hard to imagine there's something even better.

m1el an hour ago | parent | next [-]

I've been using Nemotron ASR with my own ported inference, and I'm happy with it:

Multicomp 9 minutes ago | parent [-]

I'm so amazed to find out just how close we are to the Star Trek voice computer. I used Dragon Dictation to draft my first novel, and had to learn a 'language' to tell the rudimentary engine how to recognize my speech. Then I discovered Handy [1] and have been using it for some basic speech recognition, amazed at what a local model can do. But it can't transcribe any text until I finish recording a file, and only then does it start work, so feedback arrives in very slow batches. And now you've posted this cool solution which streams audio to a model in infinitesimally small chunks, amazing, just amazing. Now if only I can figure out how to contribute to Handy or similar to do that speech-to-text in a streaming mode, local STT will be a solved problem for me.

[1] https://github.com/cjpais/Handy

czottmann 2 hours ago | parent | prev | next [-]

I liked Parakeet v3 a lot until it started to drop whole sentences, willy-nilly.

tylergetsay 2 hours ago | parent | prev | next [-]

I've been using Parakeet V3 locally and, totally anecdotally, this feels more accurate but slightly slower.

whinvik an hour ago | parent | prev [-]

Came here to ask the same question!