account42 7 days ago
It conceptually uses arrays of code points, which need up to 24 bits. Optimizing the storage to use smaller integers when possible is an implementation detail.
jibal 7 days ago
Python 3 is specified to use arrays of 8-, 16-, or 32-bit units, with the width chosen by the largest code point in the string. As a result, every code point in every string is O(1) indexable. The claim that "Python 3 internally uses UTF-32" is simply false.
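For anyone who wants to check the per-string width themselves, here is a rough sketch. It assumes CPython 3.3 or later and relies on `sys.getsizeof`, whose results are an implementation detail; `bytes_per_code_point` is a helper name made up for this example, not anything from the standard library.

```python
import sys


def bytes_per_code_point(ch: str, n: int = 1000) -> int:
    """Estimate the storage used per code point in a string made only of `ch`.

    Comparing two lengths cancels out the fixed object header, so only the
    per-element cost remains. Assumes CPython; other implementations may differ.
    """
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) // n


# One sample each from ASCII, Latin-1, the rest of the BMP, and an astral plane.
for ch in ("a", "é", "€", "😀"):
    print(f"U+{ord(ch):04X}: {bytes_per_code_point(ch)} byte(s) per code point")

# Indexing yields whole code points whatever the storage width is.
s = "a€😀"
assert len(s) == 3
assert s[2] == "😀"
```

On a stock CPython build I would expect the widths to come out as 1, 1, 2, and 4 bytes, which is the 8/16/32-bit split described above. The numbers say nothing about other Python implementations.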
zahlman 6 days ago
> code points, which need up to 24 bits

They need at most 21 bits. The bits may only be available in multiples of 8, but the implementation also doesn't byte-pack them into 24-bit units, so that's moot.
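The 21-bit figure is easy to confirm from the size of the Unicode codespace. A minimal sketch, where `MAX_CODE_POINT` and `bits_needed` are names chosen just for this example:

```python
import sys

# Highest code point in the Unicode codespace (U+10FFFF).
MAX_CODE_POINT = 0x10FFFF

# Python 3.3+ exposes the same limit as sys.maxunicode.
assert sys.maxunicode == MAX_CODE_POINT

# Number of bits required to represent the largest code point.
bits_needed = MAX_CODE_POINT.bit_length()
print(bits_needed)

# The value lies in [2**20, 2**21), so 20 bits are too few and 21 suffice.
assert 2**20 <= MAX_CODE_POINT < 2**21
```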