Remix.run Logo
anonymousiam a day ago

Whisper is excellent, but not perfect.

I used Whisper last week to transcribe a phone call. In the transcript, the name of the person I was speaking with (Gem) was alternately transcribed as either "Jim" or "Jem", but never "Gem."

JohnKemeny 21 hours ago | parent | next [-]

Whisper supports adding a context, and if you're transcribing a phone call, you should probably add "Transcribe this phone call with Gem", in which case it would probably transcribe more correctly.

ctxc 20 hours ago | parent [-]

Thanks John Key Many!

t-3 19 hours ago | parent | prev [-]

That's at least as good as a human, though. Getting to "better-than-human" in that situation would probably require lots of potentially-invasive integration to allow the software to make correct inferences about who the speakers are in order to spell their names correctly, or manually supplying context as another respondent mentioned.

anonymousiam 16 hours ago | parent [-]

When she told me her name, I didn't ask her to repeat it, and I got it right through the rest of the call. Whisper didn't, so how is this "at least s good as a human?"

t-3 16 hours ago | parent [-]

I wouldn't expect any transcriber to know that the correct spelling in your case used a G rather than a J - the J is far more common in my experience. "Jim" would be an aberration that could be improved, but substitution "Jem" for "Gem" without any context to suggest the latter would be just fine IMO.