Remix.run Logo
hyeonwho4 20 hours ago

초성 = initial sound (consonant) 중성 = middle sound (vowel) 종성 = final sound (consonant)

My understanding is that there are two possible unicode encodings of Korean, one of which (MacOS) is sound by sound instead of syllable by syllable (Windows). This is why Korean UTF-8 filenames from MacOS appear broken on modern Windows machines.

kijin 14 hours ago | parent [-]

Yeah, it's stupid that Windows can't normalize the two completely valid ways of expressing Hangul in Unicode. If they can process e + acute accent = é, they should be able to do ㄱ + ㅏ = 가.

Having said that, MacOS also made the strange choice of expressing Hangul using the Hangul Jamo (by sound) Unicode block even when there are equivalent precomposed symbols in the Hangul Syllables block. Encoding each sound individually takes up 2-3 times more storage, just like with accented characters in Latin. Besides, if you just list sounds and rely on them to be combined automatically, what do you do when you legitimately want to write a sequence of uncombined sounds, like ㄱㅏㅁ instead of 감?