| ▲ | giancarlostoro a day ago | |||||||
You're right, in my rage I typo'd. It's really frustrating: even friends will text me something that makes no sense, and 2 minutes later "STUPID VOICE TO TEXT". I have a few friends who drive trucks, so they need to be able to use their voice to communicate. | ||||||||
| ▲ | delecti a day ago | parent | next [-] | |||||||
Better speech transcription is cool, but that feels kinda contrived. Phone calls exist, as do voice messages sent via texting apps, and professional drivers can also just wait a bit to send messages if they really must be text. They're on the job, but if it's really that urgent they can pull over. | ||||||||
| ||||||||
| ▲ | wolvoleo a day ago | parent | prev [-] | |||||||
I have to say that OpenAI's Whisper model is excellent. If you could leverage that somehow, I think it would really improve. I run it locally myself on an old PC with a 3060 card. This way I can run Whisper large, which is still speedy on a GPU, especially with faster-whisper. An added bonus is the language autodetection, which is great because I speak 3 languages regularly. I think there are even better models now, but Whisper still works fine for me. And there's a big ecosystem around it. | ||||||||
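For anyone who wants to try the local setup described above, here is a rough sketch using the faster-whisper Python package. The model name `large-v3`, the `float16` compute type, and the helper names `format_line` and `transcribe` are my own assumptions for illustration, not the exact configuration from the comment. Leaving the language unset is what triggers autodetection.

```python
from typing import List, Tuple


def format_line(start: float, end: float, text: str) -> str:
    """Render one transcribed segment as a timestamped line."""
    return f"[{start:6.2f}s -> {end:6.2f}s] {text.strip()}"


def transcribe(path: str, model_size: str = "large-v3") -> Tuple[str, List[str]]:
    """Transcribe an audio file on the GPU and return (language, lines).

    Assumes faster-whisper is installed and a CUDA-capable GPU is available.
    """
    # Imported lazily so format_line works without the package installed.
    from faster_whisper import WhisperModel

    # float16 is an assumed setting; pick whatever fits your card's memory.
    model = WhisperModel(model_size, device="cuda", compute_type="float16")

    # No language argument is passed, so the model detects it from the audio.
    segments, info = model.transcribe(path, beam_size=5)

    # segments is a lazy generator; iterating it runs the actual decoding.
    lines = [format_line(s.start, s.end, s.text) for s in segments]
    return info.language, lines
```

Calling `transcribe` on a voice memo should give back the detected language code plus one timestamped line per segment. On a card with less memory, a smaller model size or an int8 compute type may be a better fit.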
| ||||||||