Remix clone Hacker News


	scosman 8 hours ago
		I’m a big fan of WhisperKit for this, and they just added TTS. It’s great because it supports features like speaker diarization (“who spoke when”) and custom dictionaries. Here’s a load test where they run 4 models in real time on the same device:
		- Qwen3-TTS: text to speech
		- Parakeet v2: Nvidia speech-to-text model
		- Canary v2: multilingual / translation STT
		- Sortformer: speaker diarization
		https://x.com/atiorh/status/2027135463371530695