GaggiX 3 hours ago
GPT-4o mini transcribe is better and actually real-time. Whisper is trained to encode the entire audio (or at least 30-second chunks) and then decode it.
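To make the 30-second point concrete, here is a minimal sketch of fixed-window chunking. It assumes 16 kHz mono audio held as a plain list of floats; `split_into_windows` is a hypothetical helper, not part of any library. The open-source Whisper code pads or trims its input to 30 s, but it advances through long audio using predicted timestamps rather than the fixed stride used here, so treat this as a simplification.

```python
SAMPLE_RATE = 16_000                        # samples per second (assumed input format)
CHUNK_SECONDS = 30                          # window length the encoder expects
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def split_into_windows(samples: list[float]) -> list[list[float]]:
    """Split audio into 30 s windows, zero-padding the final one.

    Hypothetical helper for illustration only.
    """
    windows = []
    for start in range(0, len(samples), CHUNK_SAMPLES):
        window = samples[start:start + CHUNK_SAMPLES]
        # Pad the last window with silence so every window has the same length.
        window = window + [0.0] * (CHUNK_SAMPLES - len(window))
        windows.append(window)
    return windows
```

Each window has to be complete (or padded) before encoding starts, which is why this design is awkward for low-latency streaming.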
mdrzn 3 hours ago
So "GPT-4o mini transcribe" is not just Whisper v3 under the hood? Btw, it's $0.006 / minute. For the Whisper API online (with v3 large) I've found "$0.00125 per compute second", which is the absolute cheapest I've ever found.
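The two prices use different units, so comparing them needs an assumption about how fast the model runs. A rough sketch of the arithmetic, using only the two figures quoted above; `speedup` (how many times faster than real time transcription runs) is a hypothetical parameter that depends on the provider and hardware:

```python
PER_AUDIO_MINUTE = 0.006       # $ per minute of audio, as quoted above
PER_COMPUTE_SECOND = 0.00125   # $ per second of compute, as quoted above


def cost_per_audio_minute(speedup: float) -> float:
    """Cost of one audio minute under compute-second billing.

    Assumes 60 s of audio takes 60 / speedup seconds of compute.
    """
    return PER_COMPUTE_SECOND * 60.0 / speedup


# Speedup at which both pricing schemes cost the same per audio minute.
break_even_speedup = PER_COMPUTE_SECOND * 60.0 / PER_AUDIO_MINUTE
```

Under these assumptions the break-even point is about 12.5x real time: compute-second billing is only cheaper if the model transcribes faster than that.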
emmettm 3 hours ago
The linked article claims that the average word error rate of Voxtral mini v2 is lower than that of GPT-4o mini transcribe.
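For reference, word error rate is usually the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. A minimal sketch follows; `word_error_rate` is a hypothetical helper, and it skips the text normalization (casing, punctuation) that benchmarks typically apply first. How the linked article computed its averages is not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.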