| ▲ | lioeters 3 hours ago | |
> extra step to get from CST to AST

Could you elaborate on what this involves? I'm also looking at using tree-sitter as a parser for a new language, possibly to support multiple syntaxes. I'm thinking of converting its parse trees to a common schema, which would be the target language. I guess I don't quite get the difference between a concrete and an abstract syntax tree. Is it just that the former includes information that's irrelevant to the semantics of the language, like whitespace? | ||
| ▲ | WilcoKruijer an hour ago | parent | next [-] | |
In this context you could say that CST -> AST is a normalization process. A CST might contain whitespace and comments; an AST almost certainly won't. An example: in a CST, `1 + 0x1` might be represented differently than `1 + 1`, but they could be equivalent in the AST. The same could be true for syntactic sugar: `let [x,y] = arr;` and `let x = arr[0]; let y = arr[1];` could be the same after AST normalization. You can see why having just the AST might not be enough for syntax highlighting. As a side project I've been working on a simple programming language where I use tree-sitter for the CST but first normalize it to an AST before doing semantic analysis such as verifying references. | ||
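To make the normalization idea concrete, here is a minimal sketch in Python. It does not use tree-sitter's API: `CstNode`, `to_ast`, the `binary` helper, and the node type names (`integer`, `operator`, `whitespace`, `comment`, `binary_expression`) are all hypothetical stand-ins for whatever a real grammar would produce. It only shows that `1 + 0x1` and `1 + 1` can differ as CSTs yet normalize to the same AST.

```python
from dataclasses import dataclass, field


@dataclass
class CstNode:
    """Hypothetical CST node: keeps every token, including trivia."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)


# Node types that carry no semantic meaning (assumed names).
TRIVIA = {"whitespace", "comment"}


def to_ast(node):
    """Normalize a CST node into a plain int/tuple AST."""
    if node.type == "integer":
        # Base 0 lets int() accept both decimal and 0x-prefixed literals.
        return int(node.text, 0)
    if node.type == "binary_expression":
        # Drop trivia, then expect: left operand, operator, right operand.
        left, op, right = [c for c in node.children if c.type not in TRIVIA]
        return (op.text, to_ast(left), to_ast(right))
    raise ValueError(f"unhandled CST node type: {node.type}")


def binary(left_text, right_text, trailing_comment=None):
    """Build the CST for `<left> + <right>` by hand (stand-in for a parser)."""
    kids = [
        CstNode("integer", left_text),
        CstNode("whitespace", " "),
        CstNode("operator", "+"),
        CstNode("whitespace", " "),
        CstNode("integer", right_text),
    ]
    if trailing_comment is not None:
        kids.append(CstNode("comment", trailing_comment))
    return CstNode("binary_expression", children=kids)


hex_cst = binary("1", "0x1", trailing_comment="// hex")
dec_cst = binary("1", "1")

print(hex_cst == dec_cst)                   # the CSTs differ
print(to_ast(hex_cst) == to_ast(dec_cst))   # the ASTs are equal
```

The original literal text and the comment survive only in the CST, which is the information a highlighter or formatter would still need.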
| ▲ | FjordWarden 3 hours ago | parent | prev | next [-] | |
tree-sitter returns a tree of nodes, which you walk with a visitor pattern. I've experimented with using tree-sitter queries for this, but so far haven't found them to be easier. Every syntax will have its own CST, but each can target a general AST, if you will. In the end both can be represented as s-expressions, but you need rules to go from one flavour of syntax tree to the other. An AST is just a CST minus range info, with simplified/generalised lexical info (in most cases). | ||
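A rough sketch of the visitor approach, again in plain Python rather than tree-sitter's real bindings. The `Node` class, the `AstBuilder` visitor, and the node types `infix_add`, `prefix_add`, `plus`, and `number` are assumptions made up for illustration. Two different surface syntaxes get their own CST shape and their own rule, both rules emit the same general AST, and the byte ranges are dropped along the way.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical CST node carrying the byte range it spans in the source."""
    type: str
    start: int
    end: int
    text: str = ""
    children: list = field(default_factory=list)


class AstBuilder:
    """Visitor that dispatches on node.type to a visit_<type> method."""

    def visit(self, node):
        method = getattr(self, f"visit_{node.type}", None)
        if method is None:
            raise ValueError(f"no rule for CST node type {node.type!r}")
        return method(node)

    def visit_number(self, node):
        return int(node.text)

    # Rule for an infix syntax: children are (left, operator, right).
    def visit_infix_add(self, node):
        left, _op, right = node.children
        return ("add", self.visit(left), self.visit(right))

    # Rule for a Lisp-like syntax: children are (operator, left, right).
    def visit_prefix_add(self, node):
        _op, left, right = node.children
        return ("add", self.visit(left), self.visit(right))


def sexpr(ast):
    """Render the tuple-based AST as an s-expression string."""
    if isinstance(ast, tuple):
        return "(" + " ".join(sexpr(part) for part in ast) + ")"
    return str(ast)


# Hand-built CST for the source text "1 + 2".
infix = Node("infix_add", 0, 5, children=[
    Node("number", 0, 1, "1"),
    Node("plus", 2, 3, "+"),
    Node("number", 4, 5, "2"),
])

# Hand-built CST for the source text "(+ 1 2)".
prefix = Node("prefix_add", 0, 7, children=[
    Node("plus", 1, 2, "+"),
    Node("number", 3, 4, "1"),
    Node("number", 5, 6, "2"),
])

builder = AstBuilder()
print(sexpr(builder.visit(infix)))   # (add 1 2)
print(sexpr(builder.visit(prefix)))  # (add 1 2)
```

Supporting another syntax under this scheme would mean adding one more set of `visit_*` rules while the AST shape stays fixed.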
| ▲ | direwolf20 3 hours ago | parent | prev [-] | |
That's correct. | ||