Cloudef 4 days ago

Indeed, one pain point of Unicode is CJK (Han) unification. https://heistak.github.io/your-code-displays-japanese-wrong/

asddubs 4 days ago

The fact that there is seemingly no interest in fixing this, and that if you want Chinese and Japanese in the same document you're just fucked, forever, is crazy to me.

They should add separate code points for each variant and at least make it possible to avoid the problem in new documents. I've heard the arguments against this before, but the longer you wait, the worse the problem gets.

Cloudef 3 days ago

AFAIK there are some language hints nowadays, but it's kind of a hack.

meindnoch 3 days ago

What happens if you want both single-storey "a" and double-storey "a" in the same document? You use a different font.

asddubs 3 days ago

I won't even touch the fact that what you're talking about is just a stylistic difference rather than a language-based one, and will instead say this: what if you want the Cyrillic letter А and the Latin letter A, which look literally identical, in the same document? Oh wait, both of those have separate Unicode code points. But if you want Chinese and Japanese characters, which do not look identical, in the same document, you have to resort to changing fonts? What if you're using a plain-text format that doesn't support specifying fonts? Your non-response doesn't solve anything and helps no one.
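To make the code point comparison concrete, here is a minimal Python sketch using only the standard library. The helper name `describe` and the choice of 直 (U+76F4) as the sample unified ideograph are illustrative assumptions, not something from the comment above.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8"))

latin_a = "\u0041"     # Latin capital A
cyrillic_a = "\u0410"  # Cyrillic capital A, usually drawn the same as Latin A
han = "\u76f4"         # 直, chosen here as an example of a unified ideograph

for ch in (latin_a, cyrillic_a, han):
    print(describe(ch))

# The two look-alike letters are distinct characters...
print(latin_a == cyrillic_a)
# ...while the ideograph is a single code point with no language attached.
print(len(han))
```

The point of the sketch is only that the string itself carries no Chinese/Japanese distinction for the ideograph; whatever renders it has to get that information from somewhere else.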

eviks 3 days ago

Some fonts include both alternatives.

eviks 3 days ago

Why is the language tag not used to signal a variant?

jabedude 3 days ago

That doesn't help in a mixed Chinese-Japanese document

eviks 3 days ago

Why not? There's no limit of one tag per document, and you can tag every mixed part with the appropriate language.

jabedude 2 days ago

That's not the only granularity of mixed text. A Chinese textbook about the Japanese language will have sentences where the languages are mixed

eviks a day ago

You still haven't explained what the issue is

Chinese textbook: <ch>Chinese <jp>Mixed Japanese</jp> continue Chinese.</ch>
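
The nesting above can be sketched with HTML's `lang` attribute, where the BCP 47 codes would be `zh` and `ja`. The Python below is only an illustration, under the assumption that the document format supports markup at all: it walks the markup and reports which language is in effect for each text run. The class `LangRuns` and function `lang_runs` are made-up names for this example.

```python
from html.parser import HTMLParser

class LangRuns(HTMLParser):
    """Collect (language, text) pairs; lang is inherited from the nearest ancestor."""

    def __init__(self):
        super().__init__()
        self.stack = [None]  # language in effect at each nesting level
        self.runs = []

    def handle_starttag(self, tag, attrs):
        # An element without its own lang attribute inherits its parent's.
        self.stack.append(dict(attrs).get("lang", self.stack[-1]))

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((self.stack[-1], text))

def lang_runs(markup):
    parser = LangRuns()
    parser.feed(markup)
    parser.close()
    return parser.runs

sample = '<p lang="zh">Chinese <span lang="ja">Mixed Japanese</span> continue Chinese.</p>'
for lang, text in lang_runs(sample):
    print(lang, text)
```

This simple stack does not handle void elements such as `<br>`, and it says nothing about whether a given renderer actually picks a different font per language; it only shows that tagging at sub-sentence granularity is expressible.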