rzzzwilson 7 hours ago

Where is the "python syntax"?

hresvelgr 6 hours ago | parent | next [-]

I suspect that was in the initial prompt that was used to generate this and the LLM decided Rust syntax was preferable.

metadat 6 hours ago | parent [-]

Yes, it looks almost exactly like Rust. Expectations violation! :)

AGDNoob 7 hours ago | parent | prev [-]

Yeah, that's fair. It's got "fn main()", types like "i32", and uses braces. More Rust-like than Python, to be honest. The "Python-like" part is mostly wishful thinking about readability. Should've just called it a "minimalist systems language" or something.

rzzzwilson 7 hours ago | parent [-]

I was hoping for no {}, just indentation, but ...

nine_k 6 hours ago | parent | next [-]

Indent-based syntax is relatively simple to parse. You basically need two pieces of state: are you in indent-sensitive mode (not inside a literal, not inside a parenthesized expression), and what indentation did the previous line have. Then you can easily issue INDENT and DEDENT tokens, which work exactly like "{" and "}". The actual Python parser does issue these tokens.
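For anyone who wants to check the claim about Python's own tokenizer, here is a small sketch using the standard-library `tokenize` module. The helper name `layout_token_names` and the sample inputs are made up for illustration. It collects only the INDENT and DEDENT tokens, and it also shows that indentation inside a parenthesized expression produces neither.

```python
import io
import tokenize


def layout_token_names(src):
    """Return the names of the INDENT/DEDENT tokens Python's tokenizer emits for src."""
    names = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type in (tokenize.INDENT, tokenize.DEDENT):
            names.append(tokenize.tok_name[tok.type])
    return names


# One indented block: exactly one INDENT and one matching DEDENT.
print(layout_token_names("if x:\n    y = 1\nz = 2\n"))

# Continuation lines inside parentheses are not indent-sensitive.
print(layout_token_names("f(\n    1,\n        2)\n"))
```

The first call should report one INDENT/DEDENT pair and the second should report nothing, which matches the "not inside a parenthesized expression" condition above.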

Actually Haskell has both indent-based and curlies-based syntax, and curlies freely replace indentation, and vice versa (but only as pairs).

pansa2 4 hours ago | parent [-]

> You basically need two pieces of state

That’s enough for INDENT, but for DEDENT you also need a stack of previous indentation levels. That’s how, when the amount of indentation decreases, you know how many DEDENTs to emit.

The requirement for a stack means that Python’s lexical grammar is not regular.
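A rough sketch of the stack approach, in case it helps. This is not CPython's actual implementation. The function name `emit_layout_tokens` and the `(kind, text)` token shape are assumptions for illustration, and it only handles spaces (no tabs, no brackets, no string literals).

```python
def emit_layout_tokens(lines):
    """Turn physical lines into ("LINE" | "INDENT" | "DEDENT", text) tuples.

    Assumes space-only indentation and ignores brackets and string literals.
    """
    stack = [0]  # indentation widths of the currently open blocks
    tokens = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # blank lines do not affect block structure
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            # Deeper than the current block: open exactly one new block.
            stack.append(width)
            tokens.append(("INDENT", ""))
        else:
            # Shallower: pop one level per closed block, one DEDENT each.
            while width < stack[-1]:
                stack.pop()
                tokens.append(("DEDENT", ""))
            if width != stack[-1]:
                raise IndentationError("dedent does not match any outer level")
        tokens.append(("LINE", text))
    # Close whatever is still open at end of input.
    tokens.extend(("DEDENT", "") for _ in stack[1:])
    return tokens


for kind, text in emit_layout_tokens(["a:", "  b:", "    c", "d"]):
    print(kind, text)
```

The line `d` drops from width 4 to width 0, and the stack is what says that this closes two blocks rather than one.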

AGDNoob 6 hours ago | parent | prev [-]

Yeah, braces made the parser way simpler for a first attempt. Significant whitespace is on the maybe-list, but honestly it seems scary to implement correctly.

zahlman 6 hours ago | parent [-]

I feel like Python-style indentation should be much easier to parse intuitively (preprocess the line, count leading levels of indentation) than by fully committing to formal theory. Not theoretically optimal and not "single-pass" but is that really the bottleneck?
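Something like the following two-pass sketch is what I have in mind. It assumes a fixed indent unit of four spaces, no tabs, and no multi-line expressions. The names `measure` and `nest` are invented for this example. The first pass reduces each line to a (level, text) pair, and the second builds nested blocks from those pairs.

```python
def measure(source, unit=4):
    """Pass 1: reduce each non-blank line to (indent level, stripped text)."""
    rows = []
    for raw in source.splitlines():
        if not raw.strip():
            continue
        width = len(raw) - len(raw.lstrip(" "))
        if width % unit:
            raise ValueError(f"indent of {width} is not a multiple of {unit}")
        rows.append((width // unit, raw.strip()))
    return rows


def nest(rows):
    """Pass 2: build a tree of (text, children) tuples from (level, text) rows."""
    root = []
    open_blocks = [root]  # open_blocks[i] collects the entries at level i
    for level, text in rows:
        if level >= len(open_blocks):
            raise ValueError(f"line {text!r} is indented too far")
        del open_blocks[level + 1:]  # close any deeper blocks
        children = []
        open_blocks[level].append((text, children))
        open_blocks.append(children)  # this line may own a nested block
    return root


tree = nest(measure("a:\n    b\n    c:\n        e\nd\n"))
print(tree)
```

The second pass still keeps a list of open blocks, so the stack has not gone away. It has just moved out of the tokenizer.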

AGDNoob 6 hours ago | parent [-]

Yeah, that’s fair. Conceptually it’s not that hard if you’re willing to do a proper preprocess pass and generate INDENT and DEDENT tokens. For this first version I mostly optimized for not shooting myself in the foot: braces gave me very explicit block boundaries, simpler error handling, and a much easier time while bringing up the compiler and codegen. Significant whitespace is definitely interesting long term, but for a v0 learning project I wanted something boring and robust first. Once the core stabilizes, revisiting indentation-based blocks would make a lot more sense.

zahlman 5 hours ago | parent [-]

Fair enough.

Might I suggest that now is a good time to try and make a concrete wish-list of syntax features you'd like to see, and start drafting examples of how you'd like the code to look?