| ▲ | ieviev 3 hours ago | |||||||
It goes from start of the first match to the longest "alive" end, in practice it will go to a dead state and return after finding the match end. there's an implicit `.*` in front of the first pass but i felt it would've been a long tangent so i didn't want to get into it. so given input 'aabbcc' and pattern `b+`, first reverse pass (using `.*b+`) marks 'aa|b|bcc'<- the forward pass starts from the first match: 'aa->b|b|cc' marking 2 ends then enters a dead state after the first 'c' and returns the longest end: aa|bb|cc i hope this explains it better | ||||||||
| ▲ | masfuerte 2 hours ago | parent [-] | |||||||
Cheers. I was more confused by how you were doing multiple matches. So I read the paper, which describes the AllEnds algorithm. If I understand correctly, the reverse pass captures all of the match starts and these need to be remembered for the forward pass. Which is what you were showing above, but I didn't follow it. So, once it gets going, a traditional engine can produce matches iteratively with no further allocation, but RE# requires allocation proportional to the total number of matches. And in return, it's very much faster and much easier to use (with intersection and complement). | ||||||||
| ||||||||