Remix clone Hacker News

new | show | ask | jobs Github

	▲	MattPalmer1086 2 days ago
		Hashchain is the easiest to implement, but like many search algorithms can suffer from quadratic performance on really bad data and patterns (e.g. seaching for a long sequence of zero bytes in a text of zero bytes). In practice this is very rare. If you want guaranteed linear performance in the worst case too, then LinearHashchain is the one to use. It is slightly more complex to implement as it builds in a KMP style verifier to make it linear (so almost two search algorithms are needed). It is actually about as fast as HashChain generally in the average case, so you don't lose out. The others are either experimental or quite niche and not suitable for most purposes. SentinelHashchain is actually the fastest, but relies on being able to add a copy of the pattern at the end of the search text. Mostly this won't be possible in most search contexts, unless you control all memory allocations. So I'd start with HashChain, and maybe play with the linear version later - most of it is the same, it just needs a bit more adding.