numpad0 2 days ago

What you've said is correct, but it also means Unicode strings containing CJKV characters come out mildly corrupted if decoded without something like a "--interpret-as=<language>" option to select the binary-to-glyph correspondence. That's just not what Unicode should stand for.
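
A minimal Python sketch of the problem, for illustration only. It assumes U+76F4 as the sample character (a commonly cited unified code point whose conventional glyph differs between Japanese and Chinese fonts), and `with_lang_hint` is a hypothetical helper, not part of any library:

```python
import unicodedata

# Assumed sample: U+76F4, one unified code point shared by Chinese and Japanese.
ch = "\u76f4"

# The encoded bytes and the character name carry no language information.
encoded = ch.encode("utf-8")
name = unicodedata.name(ch)


def with_lang_hint(text: str, lang: str) -> str:
    """Hypothetical helper: attach the out-of-band language hint as HTML markup."""
    return f'<span lang="{lang}">{text}</span>'


# Same code point in both cases; only the external hint tells a renderer
# which regional glyph to pick.
japanese = with_lang_hint(ch, "ja")
chinese = with_lang_hint(ch, "zh-Hans")

print(encoded, name)
print(japanese)
print(chinese)
```

Drop the `lang` attribute and a renderer is left to guess from fonts or locale, which is the "mildly corrupt" case.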

You should not need to keep or infer the language hint. I know it has always been the officially sanctioned way, and it is what developers engaged in i18n work have to live with. My point is NOT that you are wrong but that this part of the Unicode spec is wrong.