lifthrasiir 3 days ago
Eh, Han unification was a one-off decision. Many (but not all) characters have since been disunified as needed, like the infamous Biang character [1], which received two different code points. Of course, common characters are much less likely to be disunified, because decades have passed since the initial encoding and any disunification would cause compatibility issues.

[1] https://en.wikipedia.org/wiki/Biangbiang_noodles#Unicode
numpad0 3 days ago | parent
It's an upheld decision. The unification is not about reducing the overall character count, but about co-mingling the CJKV languages. Adding more characters does not un-mingle the existing ones.

One thing I feared might happen, and that does seem to be happening, is that Chinese LLMs and AI projects seem to be moving away from regular omni-lingual models toward Chinese-English bilingual models. I think this is because LLMs would become confused by syntax and dictionary definitions that are invalid in Chinese, and/or generally perform worse, if substantial non-Chinese CJKV data were included in the dataset.

At the polar opposite of computing, Hollow Knight: Silksong, released just days ago, is having a Han unification font problem as well. As you might know, thanks to Han unification, each CJKV language requires its own font, no two of which can be active at the same time, and characters become mangled unless the application developer spends substantial effort implementing such a non-standard feature. The developers were not aware of that, did not spend the extra effort, and are getting review bombed in China.

It just needs to be reversed. It's a real problem. Adding more obscure characters and obscure features is tangential and not a solution. Different isolated clusters of character usage need to be separated, not overlapped into one area, just as there is no "German-French-English dictionary".