Remix.run Logo
sirwhinesalot 3 days ago

Don't compute first and follow sets. Just branch on the current token. It is trivially unambiguous since 1 token = 1 branch. Expressions can be dealt with using precedence climbing / pratt, which still just amounts to branching on the current token after the "lhs" has been computed.

If the language doesn't fit this LL1 + operator precedence mold then I would not use a recursive descent parser.

ufo 3 days ago | parent [-]

The whole issue is that, without computing first and follow sets, it's easy to make a mistake and think that the grammar is LL(1) when it actually isn't. When that happens, the if statement does you no good. It'll be deterministic, but will silently mask the ambiguity.

sirwhinesalot 3 days ago | parent [-]

The only case I can think of is something like the dangling else problem or similar, which is very easy to detect.

You need to have a recursive call to the same rule or a "parent" rule, followed by an optional token match (note that this is already more than a simple branch on the current token since it is effectively a 1 token backtrack).

If there's any extra token match inbetween (like mandatory {}) you've already dodged the problem.

So I agree a mistake is possible, but "easy" I would not call it. It only appears in very specific conditions. The issue is way more prevalent in general PEGs.