ninkendo 2 days ago
The file manager wouldn’t use the “string” type to hold file names, if it’s written properly. Languages like Rust have things like OsString as separate from String for just this reason. If you have a type that says “my contents are valid UTF-8”, then you should reject invalid UTF-8 when populating it, obviously. Why would it work any other way? If you need a type that can hold arbitrary byte sequences, use a type that can hold arbitrary byte sequences.
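To make the String vs. OsString split concrete, here is a minimal sketch, assuming a Unix target (where file names are arbitrary bytes and `std::os::unix::ffi::OsStringExt` is available). The helper names `strict_name` and `os_name` are made up for illustration; only the standard library calls are real.

```rust
use std::ffi::OsString;
use std::os::unix::ffi::OsStringExt; // assumption: Unix-only extension trait

/// Strict type: `String` promises valid UTF-8, so invalid bytes are rejected.
/// (Hypothetical helper name.)
fn strict_name(raw: Vec<u8>) -> Option<String> {
    String::from_utf8(raw).ok()
}

/// Permissive type: on Unix an `OsString` can hold any byte sequence.
/// (Hypothetical helper name.)
fn os_name(raw: Vec<u8>) -> OsString {
    OsString::from_vec(raw)
}

fn main() {
    // "caf" followed by a lone 0xE9 byte (Latin-1 'é'), which is not valid UTF-8.
    let raw = b"caf\xE9".to_vec();

    // Populating the UTF-8 type fails...
    assert!(strict_name(raw.clone()).is_none());

    // ...while the byte-oriented type stores the name as-is.
    let name = os_name(raw.clone());
    assert!(name.to_str().is_none()); // not viewable as &str
    assert_eq!(name.into_vec(), raw); // bytes round-trip unchanged
}
```

Whether a given program should pick one type or the other for a particular field is a design call; the sketch only shows that the two types behave differently on the same input.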
account42 a day ago
This is an unrealistic expectation. Local file names are just one of many cases where you need to deal with UTF-8ish data that you should interpret as UTF-8 for display but pass along unmangled to other systems. Storing all that data twice and duplicating all relevant operations is inefficient and will introduce more bugs as the two copies get out of sync. The gains from enforcing strict UTF-8 validation are minimal while the downsides are many, not the least of which is intentionally breaking forward compatibility with future Unicode versions that may extend what is valid. It is also not what happens in practice. File managers that cannot rename or delete some files because they are unnecessarily "smart" about interpreting the strings passed to them are very much how things have worked out in reality.
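The "interpret for display, pass along unmangled" pattern described here can be sketched as follows. This is an illustration only, assuming a Unix target; the `Entry` type and its methods are hypothetical, not taken from any real file manager. The name is stored once as raw bytes, decoded lossily only when it is shown, and handed on byte-for-byte.

```rust
use std::ffi::OsString;
use std::os::unix::ffi::{OsStrExt, OsStringExt}; // assumption: Unix-only traits

/// Hypothetical directory entry: the name is stored exactly once, as raw bytes.
struct Entry {
    name: OsString,
}

impl Entry {
    fn from_bytes(raw: Vec<u8>) -> Self {
        Entry { name: OsString::from_vec(raw) }
    }

    /// For display only: invalid sequences are rendered as U+FFFD.
    fn label(&self) -> String {
        self.name.to_string_lossy().into_owned()
    }

    /// What would be handed to rename/delete or to another system:
    /// the original bytes, untouched by the display decoding.
    fn raw_bytes(&self) -> &[u8] {
        self.name.as_bytes()
    }
}

fn main() {
    // A name containing a lone 0xE9 byte, which is not valid UTF-8.
    let entry = Entry::from_bytes(b"caf\xE9.txt".to_vec());

    println!("shown to the user: {}", entry.label());

    // The stored name is still the exact byte sequence we started with.
    assert_eq!(entry.raw_bytes(), &b"caf\xE9.txt"[..]);
}
```

Because the label is derived on demand rather than stored next to the raw name, there is no second copy that can drift out of sync.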