| ▲ | lifthrasiir 3 days ago |
| The unification, implemented in Unicode 1.1, is definitely a character count reduction mechanism. I'm very sure that if the decision to abandon the 16-bit character set had been made earlier, the unification wouldn't have happened. And I'm saying this as a CJKV person and past gamedev: CJKV languages each require their own font no matter whether Han unification is implemented or not. There are simply too many glyphs there; not just unified characters, but also common characters that are not considered unified are also often varying across countries. If you account for all those glyph variations in a single font, you just can't cope because OpenType only supports at most 65,536 glyphs in a single typeface. In an alternative universe OpenType may have been extended to allow more glyphs in a single typeface, I don't know, but CJKV characters are simply complex enough to require multiple font files in general. Han unification is of less concern when you have too many glyphs. |
|
| ▲ | numpad0 3 days ago | parent [-] |
| > not just unified characters, but also common characters that are not considered unified are also often varying across countries. That's the unification: the issue stemming from CJKV languages each not having their own code points. The issue is not that CJKV languages need multiple font files and that it's cumbersome; the issue is that no two CJKV fonts can be loaded at the same time because there are conflicting glyphs. Conflicting glyphs. That's just wrong. |
| |
| ▲ | lifthrasiir 3 days ago | parent | next [-] | | If you somehow want to display, say, both Japanese and Chinese texts at the same time, there is no technical obstacle that prevents you from doing so. Pan-Unicode fonts come with differently named files for CJKV characters, so that is not even difficult. Yes, your assets will have multiple multi-megabyte font files. Is that a problem for modern games? I don't think so. There is a single circumstance where this is not generally doable: a user name in globally serviced online games. (Guess why I know of this case...) Unless there is a hint that a particular user prefers their user name to be displayed in a certain way, it is difficult to decide which font to use (or even which set of fonts to use). But it's a very niche problem, and otherwise you know the language of the text you are showing and can pick the correct font from your assets. | | |
| ▲ | numpad0 2 days ago | parent [-] | | What you've said is correct, but it also means Unicode strings containing CJKV characters become mildly corrupt if decoded without a "--interpret-as=<language>" option to change the binary-glyph correspondence. That's just not what Unicode should stand for. You should not need to keep or infer the language hint. I know it was always the officially sanctioned way and what developers engaged in i18n work have to live with. My point is NOT that you are wrong but that this part of the Unicode spec is wrong. |
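For readers who want the "pick the correct font from your assets" step spelled out, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `FONT_BY_LANG` table, the file paths, the `DEFAULT_FONT` fallback and the `pick_font` helper are hypothetical names, not anything from a real engine or font package.

```python
# Hypothetical asset table: language tag -> font file.
# The paths are illustrative only, not real asset names.
FONT_BY_LANG = {
    "ja": "fonts/cjk-ja.otf",
    "ko": "fonts/cjk-ko.otf",
    "zh-Hans": "fonts/cjk-sc.otf",
    "zh-Hant": "fonts/cjk-tc.otf",
}

# Arbitrary fallback used when no language hint is available.
DEFAULT_FONT = "fonts/cjk-ja.otf"


def pick_font(lang_tag):
    """Return the font file for a language tag.

    Tries the full tag first, then progressively shorter prefixes
    ("zh-Hans-CN" -> "zh-Hans" -> "zh"), and falls back to
    DEFAULT_FONT when nothing matches or no tag is given.
    """
    parts = lang_tag.split("-") if lang_tag else []
    while parts:
        candidate = "-".join(parts)
        if candidate in FONT_BY_LANG:
            return FONT_BY_LANG[candidate]
        parts.pop()
    return DEFAULT_FONT
```

The user-name case is the one where `lang_tag` is missing: the code points alone carry no hint, so the sketch can only return the arbitrary default, which is the exact complaint being made here.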
| |
| ▲ | zahlman 2 days ago | parent | prev [-] | | > Conflicting glyphs. Which could be chosen between using variation selectors. | | |
| ▲ | numpad0 2 days ago | parent [-] | | I guess, but I've never heard of a `cat text | ivs-convert --from=utf8 --to=zh-Hans` type of thing. So practically almost non-existent. |
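To make the variation selector idea concrete, here is a rough Python sketch of what such sequences look like at the code point level. It only builds and strips the sequences; it does not convert text between regional glyph styles. The example pairs U+845B with U+E0100 and U+E0101, which are assumed here to be registered ideographic variation sequences; whether a given sequence is registered, and whether a font renders it differently, is not checked. The helper names are made up for illustration.

```python
BASE = "\u845b"        # a unified Han character
VS17 = "\U000E0100"    # VARIATION SELECTOR-17
VS18 = "\U000E0101"    # VARIATION SELECTOR-18


def is_variation_selector(ch):
    """True for U+FE00..U+FE0F and U+E0100..U+E01EF."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def strip_variation_selectors(text):
    """Drop variation selectors, leaving only the base characters."""
    return "".join(ch for ch in text if not is_variation_selector(ch))


# Two sequences sharing one base character but requesting different glyphs.
variant_a = BASE + VS17
variant_b = BASE + VS18

# The strings differ as code point sequences...
assert variant_a != variant_b
# ...but reduce to the same unified character once selectors are removed.
assert strip_variation_selectors(variant_a) == strip_variation_selectors(variant_b) == BASE
```

Note that the selector has to be attached per character, so turning existing text into a particular regional style would still need a mapping table from outside the text itself, which is the missing tooling complained about above.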
|
|