Remix.run Logo
phoronixrly 7 hours ago

Let's indeed limit the use case to the system language, let's say of a mobile phone.

You pull up a map and start navigation. All the street names are in the local language, and no, transliterating the local names to the English alphabet does not make them understandable when spoken by TTS. And not to mention localised foreign names which then are completely mangled by transliterating them to English.

You pull up a browser, open up an news article in your local language to read during your commute. You now have to reach for a translation model first before passing the data to the English-only TTS software.

You're driving, one of your friends Signals you. Your phone UI is in English, you get a notification (interrupting your Spotify) saying 'Signal message', followed by 5 minutes of gibberish.

But let's say you have a TTS model that supports your local language natively. Well due to the fact that '1.5B English speakers' apparently exist in the planet, many texts in other languages include English or Latin names and words. Now you have the opposite issue -- your TTS software needs to switch to English to pronounce these correctly...

And mind you, these are just very simple use cases for TTS. If you delve into use cases for people with limited sight that experience the entire Internet, and all mobile and desktop applications (often having poor localisation) via TTS you see how mono-lingual TTS is mostly useless and would be switched for a robotic old-school TTS in a flash...

> only that but it's also common to have system language set to English

Ask a German whether their system language is English. Ask a French person. I can go on.

numpad0 an hour ago | parent [-]

If you don't speak the local language anyway, you can't decode pronounced spoken local language names anyway. Your speech sub-systems can't lock and sync to the audio track containing languages you don't speak. Let alone transliterate or pronounce.

Multilingual doesn't mean language agnostic. We humans are always monolingual, just multi-language hot-swappable if trained. It's more like you can make;make install docker, after which you can attach/detach into/out of alternate environments while on terminal to do things or take in/out notes.

People sometimes picture multilingualism as owning a single joined-together super-language in the brain. That usually doesn't happen. Attempting this especially at young age could lead a person into a "semi-lingual" or "double-limited" state where they are not so fluent or intelligent in any particular languages.

And so, trying to make an omnilingual TTS for criticizing someone not devoting significant resources at it, don't make much sense.