demurgos | 5 days ago
> The ECMAScript/JavaScript language itself, however, exposes characters according to UCS-2, not UTF-16. The native JS semantics are UCS-2. Saying that it's UTF-16 is misleading and confuses charset, encoding and browser APIs. Ladybird is probably implementing support properly but it's annoying that they keep spreading the confusion in their article. | ||||||||
dzaima | 5 days ago
It's not cleanly one or the other, really. It's UCS-2-y by `str.length` or `str[i]`, but UTF-16-y by `str.codePointAt(i)` or by iteration (`[...str]` or `for (x of str)`). Generally, though, JS strings are just a sequence of 16-bit values, intrinsically neither UCS-2 nor UTF-16. But, practically speaking, UTF-16 is the description that matters for everything other than `str.length`/`str[i]`.
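As a rough sketch of that split, the same astral character (U+1F600, which needs a surrogate pair) looks different depending on which API you ask:

```js
// One astral character, U+1F600, stored as two 16-bit code units
// (the surrogate pair D83D DE00).
const str = "😀";

// Code-unit view: length and charCodeAt see the two halves separately.
console.log(str.length);                      // 2
console.log(str.charCodeAt(0).toString(16));  // "d83d" (high surrogate)
console.log(str.charCodeAt(1).toString(16));  // "de00" (low surrogate)

// Code-point view: codePointAt and iteration pair the surrogates back up.
console.log(str.codePointAt(0).toString(16)); // "1f600"
console.log([...str].length);                 // 1

// Lone surrogates are still legal string contents, which is why the
// underlying model is "a sequence of 16-bit values" rather than
// guaranteed well-formed UTF-16.
const lone = "\uD83D";
console.log([...lone].length);                // 1 (passes through unpaired)
```

For what it's worth, newer engines also expose `str.isWellFormed()` to test for exactly those unpaired surrogates.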