Remix.run Logo
jiggawatts 7 days ago

I was curious to see what other languages do by default (or not), so I quickly tested a few:

C# evaluates the lambda only once per item, in input order: https://dotnetfiddle.net/oyG4Cv

Java uses "Comparators", which are called repeatedly: https://www.jdoodle.com/ia/1IO5

Rust has sort_by_key() which is encouraging... but also calls the key transformation many times: https://play.rust-lang.org/?version=stable&mode=debug&editio...

C++20 with std::range also calls the lambda repeatedly: https://godbolt.org/z/raa1766PG

Obviously, the Schwartzian Transform can be manually implemented in any modern language elegantly, but it's interesting that only C# uses it implicitly!

PS: The fancy solution would be a temporary "Hashset<T,K>" used to map values to the transformed sort keys, so that the mapping Foo(T t): K would only be called once for each unique input value 't'. That way the transformation could be invoked less than O(n) times!

ynzoqn 7 days ago | parent | next [-]

Inspired by your behavior, I looked up the Wikipedia article on Schwartzian transform[1]. The article mentioned that Rust not only provides sort_by_key, but also sort_by_cached_key. :D

[1]: https://en.wikipedia.org/wiki/Schwartzian_transform

jiggawatts 7 days ago | parent [-]

I keep saying to colleagues that simply knowing that something exists in a library somewhere is 90% of the skill of modern programming.

I suspect the profession of software archeology form the Vernor Vinge science fiction novels will soon become a reality, especially now that old languages can be uplifted mechanically with AI.

Sharlin 7 days ago | parent [-]

This one is particularly easy to chance upon, though, by either browsing the docs of slices’ `sort*` methods (because they’re all clumped together on the page), or even simpler, noticing it when you type `foo.sort` and your editor suggests autocompletions.

Sharlin 7 days ago | parent | prev [-]

The reason Rust has a separate `sort_by_cached_key` is, of course, that it doesn’t make sense to cache cheap projections which are the 95% use case.

The implementation is somewhat more sophisticated than just the naive transform:

> The current implementation is based on instruction-parallel-network sort by Lukas Bergdoll, which combines the fast average case of randomized quicksort with the fast worst case of heapsort, while achieving linear time on fully sorted and reversed inputs. And O(k * log(n)) where k is the number of distinct elements in the input. It leverages superscalar out-of-order execution capabilities commonly found in CPUs, to efficiently perform the operation.

> In the worst case, the algorithm allocates temporary storage in a Vec<(K, usize)> the length of the slice.

jiggawatts 7 days ago | parent [-]

Indeed, that works the same as the default C# sort: https://play.rust-lang.org/?version=stable&mode=debug&editio...