moefh 4 days ago
It's not 2 million, it's a little over 1 million. The exact number is 1112064 = 2^16 - 2048 + 16*2^16: in UTF-16, 2 bytes can encode 2^16 - 2048 code points, and 4 bytes (a surrogate pair) can encode another 16*2^16. The 2048 surrogates are not counted because they can never appear by themselves; they're used purely for UTF-16 encoding.
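For anyone who wants to check the arithmetic, here is a quick sketch in Python. The variable names are mine, picked for illustration, and the surrogate-pair count assumes the usual split of the 2048 surrogates into 1024 high and 1024 low values.

```python
import sys

# Values a single 16-bit UTF-16 code unit can take.
code_unit_values = 2**16

# Surrogates U+D800..U+DFFF are reserved for UTF-16 pairs.
surrogates = 0xDFFF - 0xD800 + 1          # 2048

# Code points encodable in 2 bytes (one code unit).
two_byte = code_unit_values - surrogates  # 2^16 - 2048

# Code points encodable in 4 bytes: one high surrogate (1024 choices)
# followed by one low surrogate (1024 choices).
four_byte = 1024 * 1024                   # equals 16 * 2^16

total = two_byte + four_byte

# Cross-check against the code point range U+0000..U+10FFFF,
# again leaving out the surrogates.
cross_check = (sys.maxunicode + 1) - surrogates

print(two_byte, four_byte, total, cross_check)
```

Both routes give the same total, which is why the limit is U+10FFFF: that is as far as surrogate pairs can reach.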
chuckadams 3 days ago
Even with just 1 million codepoints, why did they feel the need for CJK unification? Was it so it would all fit in UCS-2 or something?