oidar 4 days ago
What's the best solution right now for speech-to-text (STT) that supports speaker diarisation?
makaimc 4 days ago
AssemblyAI (YC S17) currently stands out in the WER and accuracy benchmarks (https://www.assemblyai.com/benchmarks). Its models are accessed through a web API rather than hosted locally, and speaker diarization is enabled with a parameter in the API call (https://www.assemblyai.com/docs/speech-to-text/pre-recorded-...).
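For anyone wondering what "a parameter in the API call" might look like in practice, here is a rough sketch, not official client code. It assumes the transcript endpoint is `https://api.assemblyai.com/v2/transcript`, that diarization is switched on with a `speaker_labels` flag, and that a finished transcript carries an `utterances` list with `speaker` and `text` fields. Check all three against the docs linked above. The helper names `build_request` and `format_utterances` are made up for illustration.

```python
import json
import urllib.request

# Assumed endpoint; verify against the AssemblyAI docs before relying on it.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_request(audio_url, api_key):
    """Build (but do not send) a transcription request with diarization on."""
    payload = {
        "audio_url": audio_url,
        "speaker_labels": True,  # assumed name of the diarization parameter
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


def format_utterances(transcript):
    """Turn a finished transcript dict into 'Speaker X: text' lines.

    Assumes each utterance has 'speaker' and 'text' keys; returns an
    empty list when no utterances are present.
    """
    utterances = transcript.get("utterances") or []
    return [f"Speaker {u['speaker']}: {u['text']}" for u in utterances]
```

Sending the request is left out on purpose: you would pass it to `urllib.request.urlopen`, then poll the returned transcript ID until the job completes, and only then call `format_utterances` on the result.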
xnx 4 days ago
I like this version of Whisper which has diarization built in: https://github.com/Purfview/whisper-standalone-win |