d4rkp4ttern 21 hours ago

This was on HN 7 months ago:

https://news.ycombinator.com/item?id=45114245

Every time an STT/TTS model is posted I wonder if it will change my current workflow on macOS, which is:

STT with Parakeet-V3 via Hex [1] app for near-instant good-enough transcription for talking to AI agents.

TTS using Kyutai’s Pocket-TTS, an amazing 100M-param model. I used this to make a "voice" plugin [2] for Claude Code.

So far I haven’t seen anything that replaces these for me, or haven't been persuaded enough to spend time testing an alternative (explore/exploit and all that).

[1] Hex STT app - https://github.com/kitlangton/Hex, which is macOS-only. (also good free/OSS alternatives: Handy, VoiceInk. No need for Wispr, Superwhisper etc)

[2] Claude Code Voice Plugin - https://pchalasani.github.io/claude-code-tools/plugins-detai...

steinvakt2 15 hours ago | parent [-]

What do you consider to be the model with highest accuracy?

d4rkp4ttern 14 hours ago | parent [-]

I guess you mean for STT. For my use case of talking to AIs or coding agents, pure STT accuracy is less important than transcription speed. Transcription needs to be near-instant, and accuracy "good enough" so that the AIs can "read between the lines". Parakeet-V3 gives exactly this.