Pet subject of the week here.

The big choice is hand-rolled recursive descent vs LALR, the latter probably backed by the bison or lemon generator, with re2c for the lexer.

Passing the LALR(1) check, i.e. having bison actually accept the grammar without complaining about ambiguities, is either very annoying or requires thinking clearly about your language, depending on your perspective.

I claim that a lot of the misfires in language implementations are from not doing that work, and using a hand rolled approximation to the parser you had in mind instead, because that's nicer/easier than the formal grammar.

The parser generators emit useless error messages, yes. So if you want nice user feedback, that'll be handrolled in some fashion. Sure.

Sometimes people write a grammar and use a hand rolled parser, hoping they match. Maybe with tests.

The right answer, used by no one as far as I can tell, is to parse with the LALR-generated parser, then if that rejects your string because the program was ill-formed, call the hand-rolled one for guesswork/diagnostics. Never feed the parse tree from the hand-rolled parser into the rest of the compiler; that way lies all the bugs.

Put another way, your linter and your parser don't need to be the same tool, even if it's convenient in some senses to mash them together.
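
A minimal sketch of the arrangement described above, in Python for brevity. Everything here is illustrative: `strict_parse` is a toy stand-in for a bison/lemon-generated parser (it accepts nested parentheses and nothing else), `diagnose` is a hypothetical hand-rolled pass, and `front_end` and `Result` are made-up names. The point is only the shape: the tree comes from the authoritative parser or not at all, and the hand-rolled code is consulted only after a rejection.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class ParseError(Exception):
    """Raised by the authoritative parser; deliberately carries no detail."""


@dataclass
class Result:
    tree: Optional[list]     # only ever produced by the authoritative parser
    diagnostics: List[str]   # only ever produced by the hand-rolled pass


def strict_parse(src: str) -> list:
    """Toy stand-in for a generated parser: nested parentheses only."""
    pos = 0

    def items() -> list:
        nonlocal pos
        out = []
        while pos < len(src) and src[pos] == "(":
            pos += 1
            inner = items()
            if pos >= len(src) or src[pos] != ")":
                raise ParseError()
            pos += 1
            out.append(inner)
        return out

    tree = items()
    if pos != len(src):
        raise ParseError()
    return tree


def diagnose(src: str) -> List[str]:
    """Hypothetical hand-rolled pass: best-effort messages, never a tree."""
    msgs, stack = [], []
    for i, ch in enumerate(src):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if stack:
                stack.pop()
            else:
                msgs.append(f"col {i + 1}: ')' has no matching '('")
        else:
            msgs.append(f"col {i + 1}: unexpected character {ch!r}")
    msgs.extend(f"col {i + 1}: '(' is never closed" for i in stack)
    return msgs


def front_end(src: str,
              parse: Callable[[str], list] = strict_parse,
              explain: Callable[[str], List[str]] = diagnose) -> Result:
    try:
        return Result(tree=parse(src), diagnostics=[])
    except ParseError:
        # The guesswork runs only after rejection, and its output is text only.
        return Result(tree=None, diagnostics=explain(src) or ["syntax error"])
```

If the two disagree about what is valid, the worst outcome in this layout is a misleading message on a program that was already rejected; the rest of the compiler never sees a tree the grammar did not sanction.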

▲

mrkeen 3 days ago | parent [-]

> parse with the lalr generated parser, then if that rejects your string because the program was ill formed, call the hand rolled one for guesswork/diagnostics

This feels like a recipe for disaster. If the hand-rolled parser won't match a formal grammar, why would it match the generated parser?

The poor programmer will be debugging the wrong thing.

It reminds me of my short stint writing C++ where I'd read undefined memory in release mode, but when I ran it under debug mode it just worked.

	▲	senkora 3 days ago | parent | next [-]
		> It reminds me of my short stint writing C++ where I'd read undefined memory in release mode, but when I ran it under debug mode it just worked.

		I assume it’s far too late at this point, but that almost always means that you’re invoking UB. Your next step should be enabling UBSan.
	▲	JonChesterfield 3 days ago | parent | prev | next [-]
		The generated parser will match the grammar. The hand rolled parser might do, but also might not, what with software being difficult and testing being boring and so forth.
	▲	8n4vidtmkvmk 3 days ago | parent | prev [-]
		There's risk, but it seems like you could run both parsers against the same unit tests to help mitigate.
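
One way that mitigation might look, sketched in Python under loud assumptions: `reference_accepts` is a toy stand-in for the generated parser, `handrolled_accepts` is a hypothetical hand-rolled recogniser with a deliberately planted bug, and the shared "unit tests" are simply every string over a tiny alphabet up to a small length. A real suite would use its own inputs; the harness only needs both parsers to answer accept or reject.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, List


def reference_accepts(s: str) -> bool:
    """Toy stand-in for the generated parser: balanced parentheses only."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        else:
            return False
    return depth == 0


def handrolled_accepts(s: str) -> bool:
    """Hypothetical hand-rolled recogniser with a planted bug:
    it only compares counts, so ')(' slips through."""
    return set(s) <= {"(", ")"} and s.count("(") == s.count(")")


def all_strings(alphabet: str, max_len: int) -> Iterator[str]:
    """Every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def disagreements(a: Callable[[str], bool],
                  b: Callable[[str], bool],
                  inputs: Iterable[str]) -> List[str]:
    """Inputs on which the two parsers give different accept/reject answers."""
    return [s for s in inputs if a(s) != b(s)]


mismatches = disagreements(reference_accepts, handrolled_accepts,
                           all_strings("()", 4))
```

Here `mismatches` starts with `")("`, the shortest input the planted bug lets through. An empty list would only show agreement on the inputs tried, not that the two parsers match in general.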