necovek 3 days ago
Sure, it's not perfect, and there are other issues. With UTF-8, the lead byte tells you exactly how many octets you need to read for the rest of the character. But for issue #2, that seems not too bad, since you only need to look one byte backwards. At the same time, for #3, in the middle of a UTF-8 bytestream you need to look backwards as well, for anything but the ASCII (7-bit) codepoints.
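To make the UTF-8 side of this concrete, here is a rough Python sketch of both directions: reading the sequence length off the lead byte, and stepping backwards from an arbitrary offset to the start of the character. The function names (`sequence_length`, `char_start`) are made up for illustration, and the sketch assumes well-formed UTF-8 input; it does not validate anything.

```python
def is_continuation(byte: int) -> bool:
    """Continuation bytes have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def sequence_length(lead: int) -> int:
    """Total bytes in the character that starts with this lead byte.

    Hypothetical helper; assumes `lead` really is a lead byte.
    """
    if lead < 0x80:            # 0xxxxxxx: ASCII, one byte
        return 1
    if lead >> 5 == 0b110:     # 110xxxxx: two bytes
        return 2
    if lead >> 4 == 0b1110:    # 1110xxxx: three bytes
        return 3
    if lead >> 3 == 0b11110:   # 11110xxx: four bytes
        return 4
    raise ValueError(f"not a UTF-8 lead byte: {lead:#04x}")


def char_start(data: bytes, i: int) -> int:
    """Index of the first byte of the character containing data[i].

    Steps backwards over continuation bytes; for well-formed input
    that is at most three steps.
    """
    start = i
    while start > 0 and is_continuation(data[start]):
        start -= 1
    return start


data = "aé€😀".encode("utf-8")  # 1 + 2 + 3 + 4 = 10 bytes
start = char_start(data, 5)      # offset 5 is the last byte of "€"
length = sequence_length(data[start])
print(start, length, data[start:start + length].decode("utf-8"))
```

For an ASCII byte the loop in `char_start` never runs, which is the "anything but the 7-bit codepoints" case: only multi-byte characters need the backwards look.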