omoikane 15 minutes ago
UTF-8 has the same issue ("overlong encoding"), where multiple byte sequences can represent the same code point. Someone proposed a similar tweak to remove the overlapping ranges by adjusting the base offset for byte sequences that are longer than 1 byte. That was discussed here:

https://news.ycombinator.com/item?id=44456073 - Corrected UTF-8 (2025-07-03, 54 comments)

This "corrected UTF-8" has other problems, but I thought it was interesting how the shifted-offset idea carries over.
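To make the overlap concrete, here is a rough Python sketch for the 2-byte case only. The helper `two_byte_payload` and the constant `TWO_BYTE_OFFSET` are names I made up for illustration, and the offset of 0x80 is my assumption about how a shifted scheme would work in general, not something taken from the linked proposal.

```python
def two_byte_payload(b0: int, b1: int) -> int:
    """Extract the 11 payload bits from a 110xxxxx 10xxxxxx byte pair."""
    if b0 & 0xE0 != 0xC0 or b1 & 0xC0 != 0x80:
        raise ValueError("not a 2-byte sequence")
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)


# Standard UTF-8 reads the payload directly as the code point, so the
# bytes C0 AF carry the value 0x2F ('/'), which the single byte 2F
# already encodes. That is an overlong encoding.
payload = two_byte_payload(0xC0, 0xAF)  # 0x2F

# Conforming decoders have to reject the overlong form explicitly.
try:
    bytes([0xC0, 0xAF]).decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True

# Shifted-offset idea (assumed here): start the 2-byte range where the
# 1-byte range ends, so no 2-byte sequence can land below 0x80.
TWO_BYTE_OFFSET = 0x80  # hypothetical constant: count of 1-byte code points
shifted_code_point = payload + TWO_BYTE_OFFSET  # 0xAF
```

With the offset applied, every 2-byte sequence decodes to a value of at least 0x80, so the overlap with the 1-byte range disappears by construction rather than through a validity check. Longer sequences would presumably accumulate the sizes of all the shorter ranges in the same way.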