theologic 4 days ago

I always thought this was a great implementation if you have a CUDA layer: https://github.com/rgcodeai/Kit-Whisperx

I had an old Acer laptop hanging around, so I implemented this: https://github.com/Sanborn-Young/MP3ToTXT

I forget all the details of my tweaks, but I remember that I had better throughput on my version.

I know the OP talked about wanting it local, but thomasmol/whisper-diarization on Replicate is fast and cheap. Here's a hacked front end to parse the JSON: https://github.com/Sanborn-Young/MP3_2transcript
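
For anyone curious what "parse the JSON" amounts to, here is a rough sketch, not the code from the repo above. It assumes the diarization output is a dict with a "segments" list whose entries carry "start", "speaker", and "text" keys; that shape is an assumption, so check it against the JSON you actually get back. The helper names are made up for illustration.

```python
def format_timestamp(seconds) -> str:
    """Render a second count as HH:MM:SS (fractions are truncated)."""
    total = int(float(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_transcript(payload: dict) -> str:
    """Turn diarized segments into 'timestamp speaker: text' lines.

    Assumed (not confirmed) input shape:
    {"segments": [{"start": 0.0, "speaker": "SPEAKER_00", "text": "..."}]}
    Consecutive segments from the same speaker are merged into one line.
    """
    lines = []
    current_speaker = None
    for seg in payload.get("segments", []):
        speaker = seg.get("speaker", "UNKNOWN")
        text = seg.get("text", "").strip()
        if not text:
            continue  # skip empty segments
        if lines and speaker == current_speaker:
            # Same speaker kept talking: append to the previous line.
            lines[-1] += " " + text
        else:
            stamp = format_timestamp(seg.get("start", 0.0))
            lines.append(f"[{stamp}] {speaker}: {text}")
            current_speaker = speaker
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-written sample data, not real model output.
    sample = {
        "segments": [
            {"start": 0.0, "speaker": "SPEAKER_00", "text": " Hello there."},
            {"start": 2.5, "speaker": "SPEAKER_00", "text": "How are you?"},
            {"start": 65.2, "speaker": "SPEAKER_01", "text": "Fine, thanks."},
        ]
    }
    print(segments_to_transcript(sample))
```

Merging back-to-back segments from the same speaker keeps the transcript readable, at the cost of only keeping the timestamp where each speaker turn starts.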