hrmtst93837 10 hours ago

I think for Python, entity-level merging has to treat indentation as structural rather than cosmetic, because a shifted indent can change which block a statement belongs to. In my experience the pragmatic approach is to parse into an AST with a tolerant parser like parso or tree-sitter, perform a 3-way AST merge that matches functions and classes by name and signature, then reserialize while preserving comment and whitespace spans. The practical tradeoff is that conflicted code is often syntactically invalid, so you need error-tolerant recovery or a token-level fallback that normalizes INDENT and DEDENT tokens and runs an LCS-style merge on tokens when AST matching fails. I've found that combining node-matching heuristics with a lightweight reindent pass cuts down the number of manual fixes, but you still get a few gnarly cases when someone renames a symbol and moves its body in the same commit.
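The name-matching step described above can be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the approach as actually implemented: it swaps in the stdlib `ast` module for parso or tree-sitter (so it is not error-tolerant and raises `SyntaxError` on invalid input), it matches top-level entities by name only rather than name and signature, and it ignores decorators and comments between entities. The helper names `top_level_entities` and `merge_entities` are hypothetical.

```python
import ast


def top_level_entities(source: str) -> dict[str, str]:
    """Map each top-level function/class name to its exact source text."""
    tree = ast.parse(source)  # assumption: input parses; no error recovery here
    entities = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment keeps the original indentation and spacing
            entities[node.name] = ast.get_source_segment(source, node)
    return entities


def merge_entities(base: str, ours: str, theirs: str):
    """3-way merge per entity; returns (merged entities, conflicting names)."""
    b, o, t = (top_level_entities(s) for s in (base, ours, theirs))
    merged, conflicts = {}, []
    for name in sorted(set(b) | set(o) | set(t)):
        vb, vo, vt = b.get(name), o.get(name), t.get(name)
        if vo == vt:        # both sides agree (including both deleted)
            result = vo
        elif vo == vb:      # only "theirs" changed this entity
            result = vt
        elif vt == vb:      # only "ours" changed this entity
            result = vo
        else:               # both changed it differently: needs a fallback
            conflicts.append(name)
            continue
        if result is not None:  # None means the entity was deleted
            merged[name] = result
    return merged, conflicts


base = "def f():\n    return 1\n\ndef g():\n    return 2\n"
ours = "def f():\n    return 10\n\ndef g():\n    return 2\n"
theirs = "def f():\n    return 1\n\ndef g():\n    return 20\n"

merged, conflicts = merge_entities(base, ours, theirs)
print(merged["f"])   # ours' edit to f is kept
print(merged["g"])   # theirs' edit to g is kept
print(conflicts)     # [] because the edits touch different entities
```

A line-based merge would also handle this particular case, since the edits are far apart. The entity view starts to pay off when the functions are reordered or adjacent. The names that end up in `conflicts` are where the token-level fallback would take over, and a rename plus move in the same commit shows up here as one deletion and one unrelated addition, which is exactly the gnarly case.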

rs545837 9 hours ago | parent [-]

Really appreciate the detail here, this is clearly hard-won experience. I agree that indentation is structural in Python.