giancarlostoro a day ago

You're right, in my rage I typoed. It's really frustrating: even friends will text me something that makes no sense, and 2 minutes later send "STUPID VOICE TO TEXT". I have a few friends who drive trucks, so they need to be able to use their voice to communicate.

delecti a day ago | parent | next [-]

Better speech transcription is cool, but that feels kinda contrived. Phone calls exist, so do voice messages sent via texting apps, and professional drivers can also just wait a bit to send messages if they really must be text; they're on the job, but if it's really that urgent they can pull over.

jimbokun a day ago | parent [-]

They can also use paper maps instead of GPS.

wolvoleo a day ago | parent | prev [-]

I have to say that OpenAI's Whisper model is excellent. If you could leverage that somehow, I think voice-to-text would really improve. I run it locally myself on an old PC with a 3060 card. This way I can run Whisper large, which is still speedy on a GPU, especially with faster-whisper. An added bonus is the language autodetection, which is great because I speak 3 languages regularly.

I think there are even better models now, but Whisper still works fine for me. And there's a big ecosystem around it.
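
For anyone curious, a rough sketch of what that kind of local setup can look like with the faster-whisper Python package. The model name `large-v3`, the `float16` compute type, and the helper names `format_segment` and `transcribe_file` are assumptions for illustration, not details from the comment above; the audio path is a placeholder.

```python
from types import SimpleNamespace


def format_segment(segment):
    """Render one transcribed segment as '[start -> end] text' (times in seconds)."""
    return f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text.strip()}"


def transcribe_file(path, model_size="large-v3"):
    """Transcribe an audio file on the GPU and report the detected language.

    Hypothetical helper: model size and compute type are assumed values.
    """
    # Imported lazily so format_segment works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cuda", compute_type="float16")
    # No language argument is passed, so the model detects the spoken language.
    segments, info = model.transcribe(path, beam_size=5)
    # segments is a generator; transcription runs as it is consumed.
    lines = [format_segment(s) for s in segments]
    return info.language, info.language_probability, lines


if __name__ == "__main__":
    # Stand-in segment so the formatter can be tried without a GPU or model.
    fake = SimpleNamespace(start=0.0, end=1.5, text=" hello there ")
    print(format_segment(fake))
```

Calling `transcribe_file("memo.wav")` would return the detected language code, its probability, and the formatted lines, assuming the package, a CUDA-capable GPU, and the model weights are available.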

nomel a day ago | parent [-]

I wonder what the wattage difference is between the iPhone STT and Whisper? How many seconds would the iPhone battery last?