Remix.run Logo
Sharlin 2 days ago

  def hash(str):
    len(str)
O(1), baby!
zimpenfish 2 days ago | parent | next [-]

O(1) only if you're working in a language with length-stored strings (like Pascal[0]), right?

In something like C with its generic strings[1], it would surely have to be O(n) since you have to scan the entire string to calculate its length?

(I have always been terrible at big-O, mind.)

[0] There's probably more of them by now.

[1] ie. not a specific length-stored string type.

Sharlin 2 days ago | parent [-]

Yes. But that's a known misfeature of C and no other language does it like that. Plus I kind of meant arbitrary byte strings where you can have embedded zeroes and thus have to know the length.

eru 2 days ago | parent | prev | next [-]

You'd probably want 'return len(str)', if this is Python?

In any case, for some applications this is indeed a great hash function. Programs like rmlint use it as part of their checks for duplicate files: if files have different lengths, they can't have the same content after all.

Sharlin 2 days ago | parent | next [-]

Yes, too much Rust :D Eliding the `return` becomes so intuitive after a while that you sort of forget that most other languages require it.

eru 2 days ago | parent [-]

Haskell and the Lisps also work like this.

Sharlin 2 days ago | parent [-]

Yes, I know, it's the natural way to do it in functional programming. Honestly I doubt there are any FP languages that don't do it like that.

eru a day ago | parent [-]

I like that Rust, just like Lisp and Haskell et al also allow any block to return a value, not just functions. So if-then-else and loops etc can do that. It be nice if Python could do that.

Perhaps I should hack up Python to allow that. Would be an interesting little project.

andai 2 days ago | parent | prev [-]

Re: missing return keyword

Actually, in Python, None is a valid key...

(I'm so sorry. JavaScript has ruined me.)

tetha 2 days ago | parent | next [-]

Any `return c` for some constant is a valid and correct hash function. It just has a lot of collisions and degenerates hash-maps to terrible performance. That was in fact my first thought when I read "simplest hash functions".

eru 2 days ago | parent | prev [-]

That's why I said "probably".

nkrisc 2 days ago | parent | prev [-]

If you use the identity function as your hashing function then is it O(0) because you are done before you start?

Sharlin 2 days ago | parent [-]

For an arbitrarily long input, you still have to compress it to constant size somehow.

nkrisc 2 days ago | parent [-]

Or pad all entries with 0s to an arbitrarily long size. The 0s can be assumed, and not actually stored. Therefore arbitrarily long entries need not be shortened.

Sharlin 2 days ago | parent [-]

I don't think there are any reasonable use cases for a non-constant-length hash.