Remix clone Hacker News

	walthamstow 2 days ago
		Seems quite heavy for an STT model. Parakeet and Whisper are much smaller and perform great for quick dictation and transcription of longer files. I guess that's due to additional accuracy and speaker diarisation? The TTS example clip of 'spontaneous singing' in the repo is creepy as fuck.