trenchpilgrim | a day ago
Whisper has quite bad issues with hallucination. It will inject sentences that were never said in the audio. It's decent for classification but poor at transcription.
neckro23 | 18 hours ago
Pre-processing with a vocal extraction model (BS-RoFormer or similar) helps a lot with the hallucinations, especially with poor-quality sources.
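
A rough sketch of that two-stage flow, in case it helps: isolate the vocals first, then hand only the vocal stem to the transcriber. The function name `transcribe_with_vocal_isolation` and both callables are hypothetical placeholders, not part of Whisper or any separation library; you would plug in whatever separator and transcriber you actually use.

```python
from pathlib import Path
from typing import Callable


def transcribe_with_vocal_isolation(
    audio_path: Path,
    isolate_vocals: Callable[[Path], Path],
    transcribe: Callable[[Path], str],
) -> str:
    """Isolate the vocal stem, then transcribe that stem instead of the raw mix.

    Both callables are assumptions supplied by the caller:
      isolate_vocals: writes a vocals-only file and returns its path
                      (e.g. a wrapper around a BS-RoFormer style separator).
      transcribe:     returns the transcript text for an audio file.
    """
    vocals_path = isolate_vocals(audio_path)
    return transcribe(vocals_path).strip()


# With the openai-whisper package installed, the transcriber could look like:
#   import whisper
#   model = whisper.load_model("base")
#   transcribe = lambda p: model.transcribe(str(p))["text"]
# The separator wrapper is left to the caller, since its API depends on the tool.
```

Keeping the two stages as separate callables makes it easy to compare transcripts with and without the isolation step on the same file.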