duped a day ago

Start from the perspective of a user who sees, in effect:

> error: expected the character ';' at this exact location

The user wonders, "if the parser is smart enough to tell me this, why do I need to add it at all?"

The answer to that question, "it's annoying to write the code to handle this correctly," is thoroughly lazy and boring. "My parser generator requires the grammar to be LR(1)" is even lazier. Human language doesn't fit into restrictive definitions of syntax, so why should languages for machines?

> Because code is still read more than it is written it just doesn't seem correct to introduce ambiguity like this.

That's why meaningful whitespace is better than semicolons. It forces you to write the ambiguous cases as readable code.

estebank 11 hours ago

I used to hate semicolons. Then I started working in parser recovery for rustc. I now love semicolons.

Removing redundancy from syntax should be a non-goal, an anti-goal even. The more redundancy there is, the higher the likelihood of making a mistake while writing, but the higher the ability for humans and machines to understand the developer's intent unambiguously.

Having "flagposts" in the code lets people skim it ("I'm only looking at every pub fn") and gives the parser a fighting chance of recovering ("found a parse error inside of a function def, consume everything until the first unmatched }, which would correspond to the { at the fn body start, mark the whole body as having failed parsing, and let the rest of the compiler run"). Semicolons allow for that kind of recovery. And the same logic that you would use for automatic semicolon insertion can be used to tell the user where they forgot a semicolon. That way you get the ergonomics of writing code in a slightly less principled way while still being able to read principled code after you're done.
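To make the recovery idea concrete, here is a minimal sketch in Python. It assumes the input is already a flat list of string tokens, and the function names are hypothetical; it illustrates the "skip to a flagpost" strategy in general and is not how rustc implements it.

```python
def skip_to_unmatched_close(tokens, pos):
    """Return the index just past the first unmatched '}' at or after pos.

    Assumes the '{' that opened the body was already consumed, so any '{'
    seen here opens a nested block. Returns len(tokens) if none is found.
    """
    depth = 0
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "{":
            depth += 1
        elif tok == "}":
            if depth == 0:
                return pos + 1  # this '}' closes the body we are skipping
            depth -= 1
        pos += 1
    return len(tokens)


def skip_to_statement_end(tokens, pos):
    """Return the index just past the next ';' that is not inside a nested block.

    Stops before an unmatched '}' so the caller can still close the body.
    """
    depth = 0
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "{":
            depth += 1
        elif tok == "}":
            if depth == 0:
                return pos  # leave the '}' for the caller
            depth -= 1
        elif tok == ";" and depth == 0:
            return pos + 1
        pos += 1
    return len(tokens)


# A broken body: 'let x = ( ;' never closes its parenthesis.
tokens = "fn a ( ) { let x = ( ; if y { z } } fn b ( ) { }".split()
after_body = skip_to_unmatched_close(tokens, 5)  # 5 is the first token inside a's body
after_stmt = skip_to_statement_end(tokens, 5)
```

With coarse recovery the parser resumes at the `fn` that starts `b`; with statement-level recovery it resumes at the `if`, so only the one broken statement is discarded.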

duped 11 hours ago

Why is ";" different from \n from the perspective of the parser when handling recovery within scopes? Similarly, what's different with "consume everything until the first unmatched }" except substituting a DEDENT token generated by the lexer?
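A minimal sketch of what that substitution could look like, in Python. It assumes indentation uses spaces only, that words are separated by whitespace, and that dedents always return to an earlier indentation level (a mismatched dedent is not reported here). The function names are hypothetical.

```python
def layout_tokens(source):
    """Return words plus synthetic NEWLINE / INDENT / DEDENT tokens."""
    tokens = []
    stack = [0]  # indentation widths of the currently open blocks
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no layout information
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append("INDENT")
        while width < stack[-1]:
            stack.pop()
            tokens.append("DEDENT")
        tokens.extend(line.split())
        tokens.append("NEWLINE")
    # Close every block still open at end of input.
    tokens.extend("DEDENT" for _ in stack[1:])
    return tokens


def skip_to_unmatched_dedent(tokens, pos):
    """Return the index just past the first unmatched DEDENT at or after pos."""
    depth = 0
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "INDENT":
            depth += 1
        elif tok == "DEDENT":
            if depth == 0:
                return pos + 1  # this DEDENT closes the block we are skipping
            depth -= 1
        pos += 1
    return len(tokens)


# A broken block: 'x = (' never closes its parenthesis.
source = "def a:\n    x = (\n    if y:\n        z\ndef b:\n    pass\n"
toks = layout_tokens(source)
resume = skip_to_unmatched_dedent(toks, 4)  # 4 is the first token inside a's block
```

Here the parser would resume at the `def` that starts `b`, and NEWLINE could serve as the statement-level sync point in place of `;`. The sketch sidesteps one complication: a real lexer usually suppresses NEWLINE inside open brackets, so an unclosed `(` can swallow the very tokens that recovery wants to sync on.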