JimDabell 4 days ago

Strings and Unicode are a lot more complicated than they first appear. I like the way this article puts it:

> Swift’s string implementation goes to heroic efforts to be as Unicode-correct as possible. […] This is great for correctness, but it comes at a price, mostly in terms of unfamiliarity; if you’re used to manipulating strings with integer indices in other languages, Swift’s design will seem unwieldy at first, leaving you wondering.

> It’s not that other languages don’t have Unicode-correct APIs at all — most do. For instance, NSString has the enumerateSubstrings method that can be used to walk through a string by grapheme clusters. But defaults matter; Swift’s priority is to do the correct thing by default.

> Strings in Swift are very different than their counterparts in almost all other mainstream programming languages. When you’re used to strings effectively being arrays of code units, it’ll take a while to switch your mindset to Swift’s approach of prioritizing Unicode correctness over simplicity.

> Ultimately, we think Swift makes the right choice. Unicode text is much more complicated than what those other languages pretend it is. In the long run, the time savings from avoided bugs you’d otherwise have written will probably outweigh the time it takes to unlearn integer indexing.

https://oleb.net/blog/2017/11/swift-4-strings/

I’d encourage you to read that entire article before describing strings as simple.
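
For anyone who wants to see the gap between code units and grapheme clusters that the quoted passage describes, here is a minimal Swift sketch. The two sample strings are my own picks, not examples from the article; the properties used (`count`, `unicodeScalars`, `utf16`, `utf8`) are standard `String` views.

```swift
// Sample strings chosen for illustration (not taken from the article).
let flag = "\u{1F1EF}\u{1F1F5}"   // a flag emoji: two regional-indicator scalars
let cafe = "cafe\u{301}"          // "café" spelled as "e" + combining acute accent

// String.count counts grapheme clusters (user-perceived characters).
print(flag.count)                 // 1
print(flag.unicodeScalars.count)  // 2
print(flag.utf16.count)           // 4
print(flag.utf8.count)            // 8

print(cafe.count)                 // 4
print(cafe.unicodeScalars.count)  // 5

// String equality uses canonical equivalence, so the decomposed and
// precomposed spellings compare equal.
print(cafe == "caf\u{E9}")        // true
```

A language that treats strings as arrays of UTF-16 code units would report a length of 4 for the flag, which is the kind of mismatch the article is talking about.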

frollogaston 4 days ago | parent [-]

I'm well aware of all of this. Swift strings aren't random-access by integer offset. There are reasons no other language did it Apple's way. Even in the rare situations when you do care about the edge cases with these multi-code-point symbols (basically just emojis), Swift strings still make that a nightmare, while in other languages you have easy ways to deal with it.
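
To make the "not random-access" point concrete, here is a rough sketch of what pulling out a character and a substring looks like with `String.Index`. The sample string and variable names are just for illustration.

```swift
let greeting = "Hello, world"

// greeting[7] does not compile: String has no Int subscript.
// You walk from a known index instead, which is O(n) in the offset.
let start = greeting.index(greeting.startIndex, offsetBy: 7)
let seventh = greeting[start]                      // "w"

// Slicing needs a second index; the slice is a Substring, so it is
// converted back to String here.
let end = greeting.index(start, offsetBy: 5)
let word = String(greeting[start..<end])           // "world"

print(seventh, word)
```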

I was on such a project where we cared a lot about these details, and the whole team agreed to throw Swift strings out the window and build our own array-based string replacement where each slot is a symbol. Which is probably what Apple would've done if it weren't for performance overhead.
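
Something in the spirit of that replacement can be sketched in a few lines. This is not our actual code; `CharacterArrayString` and its members are hypothetical names, and it assumes one `Character` (grapheme cluster) per slot.

```swift
// Hypothetical wrapper: stores one grapheme cluster per array slot so that
// integer indexing is O(1). Not the real project code.
struct CharacterArrayString {
    private let characters: [Character]

    // Building the array is a one-time O(n) pass over the string.
    init(_ string: String) {
        characters = Array(string)
    }

    var count: Int { characters.count }

    // Constant-time access by integer position.
    subscript(position: Int) -> Character {
        characters[position]
    }

    // Integer-range slicing, returned as a regular String.
    subscript(range: Range<Int>) -> String {
        String(characters[range])
    }

    var string: String { String(characters) }
}

let sample = CharacterArrayString("a\u{1F1EF}\u{1F1F5}b")
print(sample.count)      // 3
print(sample[1])         // the flag, as a single Character
print(sample[1..<3])     // the flag followed by "b"
```

The trade-off is the overhead mentioned above: every `Character` is stored as its own value, so this costs more memory than the native packed UTF-8 storage, plus the up-front conversion.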

Didn't help that their API was really unstable. Every major Swift version broke our code in so many places that we started adding extra layers just to protect ourselves.