numpad0 a day ago

Isn't that a bit much for ASR models? Humans can't handle a simultaneous multilingual dictation task either; I have to stop and reinitialize my ears before switching between English and my primary language.

abdullahkhalids 17 hours ago

In South Asia, it's quite common for people to speak a combination of their local language and English. Not just alternating sentences between the two languages, but actually constructing single sentences out of phrases from both.

"Madam, please believe me, maine homework kiya ha" [I did my homework].

bondarchuk a day ago

Seems like it already has the capability somewhere in the model though - see my reply to clarionbell.

cenamus 20 hours ago

Isn't that exactly what interpreters do?

numpad0 19 hours ago

If they're like me, they seem to just coordinate constant staggered resets of the subsystems of the language processing pipeline while keeping internal representations of the input in a half-text state, so that the input comes back out through the pipeline in the other configuration.

That's how I anecdotally feel and interpret how my own brain appears to work, so it could be different from how interpreters or actual human brains work, but as far as I can see, professional simultaneous interpreters don't seem to be agnostic to the relevant pairs of languages at all.