▲ | burntsushi 5 days ago | ||||||||||||||||||||||||||||||||||
Good write-up. This is a very popular approach to substring search! It is still worst case `O(m*n)` though. Do you have a fallback implementation like the `memchr` crate has to guarantee `O(m+n)`? I'll also push back on some bits in the end:
Does Zig not have a way to specialize this for sequences of unsigned 8-bit integers? If not, and you're thereforce force to used a more generic algorithm, that seems pretty unfortunate.
Oh I'm not sure I buy this at all! Substring search is a primitive operation and easily can be a bottleneck. There's a reason why widely used substring search implementations tend to be highly optimized. | |||||||||||||||||||||||||||||||||||
▲ | aarol 5 days ago | parent | next [-] | ||||||||||||||||||||||||||||||||||
I'm the author of the post, thanks for your feedback! I was inspired by your comment on HN a while back and started learning about this stuff, reading the source code of `memchr` was especially great. You're totally right about the first part there was a serious consideration to add this to zig's standard library, there would definitely need to be a fallback to avoid the `O(m*n)` situation. I'll admit that there are a lot of false assumptions at the end, you could totally specialize it for u8 and also get the block size according to CPU features at compile time with `std.simd.suggestVectorSize()` | |||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||
▲ | ozgrakkurt 5 days ago | parent | prev [-] | ||||||||||||||||||||||||||||||||||
It is very easy to specialise a function in zig, you just put if(T == u8) or something like that inside the function and do w/e in there | |||||||||||||||||||||||||||||||||||
|