afiori 7 days ago
The main issue I can see is not garbage bytes in text but the mixing of incompatible encodings, e.g. splicing Latin-1 bytes into a UTF-8 string. My understanding of the current "always and only UTF-8/Unicode" zeitgeist is that it comes mostly from encoding issues, among them the difficulty of detecting which encoding a piece of text is in. I think the current status quo is better than what came before, but I also think it could be improved.
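To illustrate the kind of splice I mean, here is a minimal Python sketch. The sample strings and the `try_decode` helper are made up for the example, not taken from any real system. It shows that the mixed buffer is rejected as UTF-8, while Latin-1 accepts it without complaint and produces mojibake, which is roughly why detection is hard.

```python
# Hypothetical sample data: one piece encoded as UTF-8, one as Latin-1.
utf8_part = "naïve ".encode("utf-8")      # 'ï' becomes the two bytes C3 AF
latin1_part = "café".encode("latin-1")    # 'é' becomes the single byte E9
mixed = utf8_part + latin1_part           # the accidental splice


def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Strict UTF-8 decoding fails: the lone E9 byte starts a multi-byte
# sequence that is never completed.
as_utf8 = try_decode(mixed, "utf-8")

# Latin-1 maps every byte to a code point, so decoding always "succeeds",
# but the UTF-8 part comes out garbled.
as_latin1 = try_decode(mixed, "latin-1")

print(as_utf8)
print(as_latin1)
```

Neither decoding recovers the original text, and since any byte sequence is valid Latin-1, a successful decode tells you nothing about whether the guess was right.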