drewbuschhorn 4 days ago

You should throw in some diarization; there are some pretty effective Python libraries that don't need pretraining on the voices to separate speakers.

nvdnadj92 4 days ago | parent | next [-]

I would suggest 2 speaker-diarization libraries:

- https://huggingface.co/pyannote/speaker-diarization-3.1
- https://github.com/narcotic-sh/senko

I personally love senko since it can run in seconds, whereas pyannote took hours, but there is a 10% WER (word error rate) that is tough to get around.
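
For anyone wiring this into a transcription tool, here is a rough sketch of the step that comes after diarization: attaching a speaker label to each transcript segment by picking the turn with the largest time overlap. It assumes the diarizer's output has already been converted to plain (start, end, speaker) turns and that the transcript has timestamped segments. The `Turn`, `Segment`, and `label_segments` names are made up for illustration and are not part of pyannote or senko.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One diarization turn (hypothetical shape, times in seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """One timestamped transcript segment (hypothetical shape)."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals, 0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, unknown="UNKNOWN"):
    """Pair each segment's text with the speaker whose turn overlaps it most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg.text))
    return labeled


# Made-up sample data, only to show the shapes involved.
turns = [Turn(0.0, 4.0, "SPEAKER_00"), Turn(4.0, 9.0, "SPEAKER_01")]
segments = [
    Segment(0.5, 3.5, "Hi, thanks for joining."),
    Segment(3.0, 8.0, "Happy to be here."),
    Segment(10.0, 11.0, "(crosstalk)"),
]

labeled = label_segments(segments, turns)
for speaker, text in labeled:
    print(f"{speaker}: {text}")
```

Largest-overlap matching is a simple choice; a segment that straddles a speaker change still gets a single label, so splitting at word-level timestamps would be more accurate if the transcriber provides them.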

Pavlinbg 4 days ago | parent | prev [-]

Nice suggestion, I'll look them up.