capitainenemo 2 days ago
Article claims Python 3 uses UTF-8. From https://stackoverflow.com/questions/1838170/: "In Python 3.3 and above, the internal representation of the string will depend on the string, and can be any of latin-1, UCS-2 or UCS-4, as described in PEP 393."

Article also says PHP has immutable strings. They are mutable, although often copied.

Article also claims the majority of popular languages have immutable strings. As well as the mutable-string languages listed there, there are also PHP and Rust (and C, but they did say C++ - and obviously Ruby, since that's the subject of the article).

I'm also a bit surprised by the last sentence: "However, if you do measure a negative performance impact, there is no doubt you are measuring incorrectly." There must surely be programs doing a lot of string building or in-place modification that would benefit from non-frozen.
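
To make the PEP 393 point above concrete, here is a minimal sketch that estimates how many bytes CPython spends per character for different strings. It assumes CPython 3.3 or later; `bytes_per_char` is a hypothetical helper written for this illustration, and `sys.getsizeof` reports implementation-specific sizes, so other interpreters (PyPy, for instance) may behave differently.

```python
import sys


def bytes_per_char(ch: str, n: int = 1000) -> int:
    """Estimate per-character storage by comparing two string lengths.

    Subtracting the sizes of an n-char and a 2n-char string cancels the
    fixed object header, leaving only the per-character cost.
    Assumption: CPython 3.3+ (PEP 393 flexible string representation).
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


# One sample character from each storage class described in PEP 393.
samples = {
    "ASCII": "a",
    "Latin-1": "\u00e9",          # e with acute accent
    "BMP (UCS-2)": "\u20ac",      # euro sign
    "astral (UCS-4)": "\U0001f600",  # emoji outside the BMP
}

for label, ch in samples.items():
    print(f"{label}: {bytes_per_char(ch)} byte(s) per character")
```

On CPython this should report 1, 1, 2 and 4 bytes respectively: the widest code point in a string decides the storage width for the whole string, which is why the internal representation is not simply UTF-8.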
byroot a day ago
> There must surely be programs doing a lot of string building or in-place modification that would benefit from non-frozen.

The point is that the magic comment (or the --enable-frozen-string-literal option) only applies to literals. If you have some code that iteratively appends to mutable strings, flipping that switch doesn't change that. It just means you'll have to explicitly create a mutable string. So it doesn't change the performance profile.
chrismorgan a day ago
Python strings aren’t even proper Unicode strings. They’re sequences of code points rather than scalar values, meaning they can contain surrogates. This is incompatible with basically everything: UTF-* as used by sensible things, and unvalidated UTF-16 as used in the likes of JavaScript, Windows wide strings and Qt.
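
The surrogate point above can be checked with a short sketch. It shows that a Python `str` accepts a lone surrogate code point, that strict UTF-8 encoding then rejects it, and that the standard `surrogatepass` error handler has to be requested explicitly to get bytes out. `is_encodable_utf8` is a hypothetical helper name used only for this illustration.

```python
def is_encodable_utf8(s: str) -> bool:
    """Return True if s can be encoded as strict UTF-8."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# U+D800 is a high surrogate: a code point, but not a Unicode scalar value.
lone_surrogate = "\ud800"

# Python accepts it as a one-character string...
print(len(lone_surrogate))

# ...but strict UTF-8 has no encoding for it.
print(is_encodable_utf8("h\u00e9llo"))
print(is_encodable_utf8(lone_surrogate))

# The "surrogatepass" handler emits the generalized (non-standard) UTF-8
# byte sequence for the surrogate instead of raising.
print(lone_surrogate.encode("utf-8", "surrogatepass"))
```

The practical consequence is that any `str` coming from an untrusted source may fail at an encoding boundary even though it was a valid Python string the whole time.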
ameliaquining a day ago
In C, C++, and Rust, the question of "are strings in this language mutable or immutable?" isn't applicable, because those languages have transitive mutability qualifiers. So they only need a single string type, and whether you can mutate it or not depends on context. (C++ and Rust have multiple string types, but the differences among them aren't about mutability.) In languages without this feature, a given value is either always mutable or never mutable, and so it's necessary to pick one or the other for string literals.
byroot a day ago
> can be any of latin-1, UCS-2 or UCS-4, as described in PEP 393

My bad, I haven't seriously used Python for over 15 years now, so I stand corrected (and will clarify the post). My main point stands though: Python strings have an internal representation, but it's not exposed to the user the way it is with Ruby strings.

> Article also says PHP has immutable strings. They are mutable, although often copied.

Same. Thank you for the correction, I'll update the post.