necovek 3 days ago

Sure, it's not perfect, and there are other issues. With UTF-8, once you have read the first octet of a character, you know exactly how many more octets you need to read for the rest of it.
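
To make the UTF-8 point concrete, here is a minimal sketch of reading the sequence length off the first octet. The function name `utf8_sequence_length` is made up for illustration, and it only looks at the lead byte (it does not validate the continuation bytes that follow):

```python
def utf8_sequence_length(lead: int) -> int:
    """Total number of octets in the UTF-8 sequence that starts with `lead`."""
    if lead < 0x80:             # 0xxxxxxx: ASCII, a single octet
        return 1
    if 0xC2 <= lead <= 0xDF:    # 110xxxxx: two octets (0xC0/0xC1 would be overlong)
        return 2
    if 0xE0 <= lead <= 0xEF:    # 1110xxxx: three octets
        return 3
    if 0xF0 <= lead <= 0xF4:    # 11110xxx: four octets (up to U+10FFFF)
        return 4
    # 0x80..0xBF are continuation bytes; the rest never appear in valid UTF-8.
    raise ValueError(f"0x{lead:02X} is not a valid UTF-8 lead byte")


for ch in ("a", "é", "€", "😀"):
    encoded = ch.encode("utf-8")
    print(ch, utf8_sequence_length(encoded[0]), len(encoded))
```

So after one octet the reader knows how many more to fetch, without any lookahead.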

But issue #2 doesn't seem too bad, since you only need to look one byte backwards.

At the same time, for #3: in the middle of a UTF-8 byte stream you also need to look backwards for anything but the ASCII (7-bit) code points.
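
For what that backwards look amounts to in UTF-8, a rough sketch follows. It assumes the input is well-formed UTF-8, and `char_start` is just a name I picked. Continuation bytes all look like 10xxxxxx, so you step back over them, at most three times, until you hit a byte that starts a character:

```python
def char_start(data: bytes, i: int) -> int:
    """Index of the first byte of the character that contains data[i].

    Assumes `data` is well-formed UTF-8; no validation is done here.
    """
    start = i
    # Continuation bytes match 10xxxxxx. A character has at most three
    # of them, so never step back more than three bytes.
    while start > 0 and i - start < 3 and (data[start] & 0xC0) == 0x80:
        start -= 1
    return start


data = "aé€😀".encode("utf-8")
print(char_start(data, 0))  # inside "a"
print(char_start(data, 2))  # second byte of "é"
print(char_start(data, 9))  # last byte of the emoji
```

For an ASCII byte the loop does not run at all, which is the 7-bit exception mentioned above.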