Remix.run Logo
nine_k 6 hours ago

Indent-based syntax is relatively simple to parse. You basically need two pieces of state: are you in indent-sensitive mode (not inside a literal, not inside a parenthesized expression), and what indentation did the previous line have. Then you can easily issue INDENT and DEDENT tokens, which work exactly like "{" and "}". The actual Python parser does issue these tokens.

Actually Haskell has both indent-based and curlies-based syntax, and curlies freely replace indentation, and vice versa (but only as pairs).

pansa2 4 hours ago | parent [-]

> You basically need two pieces of state

That’s enough for INDENT, but for DEDENT you also need a stack of previous indentation levels. That’s how, when the amount of indentation decreases, you know how many DEDENTs to emit.

The requirement for a stack means that Python’s lexical grammar is not regular.