account42 7 months ago
> > Or unpaired surrogates.
> That's just an invalid Unicode string, then. Unicode strings are sequences of Unicode scalar values, not code points. Because surrogates were retrofitted onto UCS-2 to make it into UTF-16, they are both code units and (reserved) code points.
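
For anyone who wants to poke at the distinction, here is a rough Python sketch. The helper names (`is_surrogate`, `is_scalar_value`, `can_encode_utf8`) are made up for illustration and are not from any library. It leans on the fact that Python's `str` is a sequence of code points, so it will hold a lone surrogate, while the strict UTF-8 codec refuses to encode one.

```python
# Surrogates occupy the code point range U+D800..U+DFFF.
def is_surrogate(cp: int) -> bool:
    return 0xD800 <= cp <= 0xDFFF


# A Unicode scalar value is any code point that is not a surrogate.
def is_scalar_value(cp: int) -> bool:
    return 0 <= cp <= 0x10FFFF and not is_surrogate(cp)


# Hypothetical helper: True if the string is encodable as strict UTF-8,
# i.e. it contains only scalar values.
def can_encode_utf8(s: str) -> bool:
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Python's str stores code points, so a lone surrogate is representable...
lone = chr(0xD83D)
# ...whereas this is a single scalar value outside the BMP.
paired = "\U0001F600"

print(is_surrogate(ord(lone)))           # the lone surrogate is a code point
print(is_scalar_value(ord(lone)))        # but not a scalar value
print(can_encode_utf8(lone))             # so strict UTF-8 rejects it
print(can_encode_utf8(paired))           # the scalar value encodes fine
print(paired.encode("utf-16-be").hex())  # and becomes two UTF-16 code units
```

In UTF-16 the scalar value U+1F600 turns into the code units D83D DE00, which is where the "both code units and code points" overlap comes from: the same numeric range is reserved in code point space precisely so those code units can never be mistaken for characters.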