jack_pp 3 days ago

maybe we need LLMs trained on ASTs, or a new symbolic way to represent software that's faster for LLMs to grok, plus a translator back to regular source so we can verify the code

energy123 3 days ago

You could probably build a decent agentic harness that achieves something similar.

Show the LLM a tree and/or call-graph representation of your codebase (e.g. `cargo diagram` and `cargo-depgraph`), which is token efficient.

And give the LLM a tool call to see the contents of the desired subtree. More precise than querying a RAG chunk or a whole file.

You could also have another optional tool call which routes the text content of the subtree through a smaller LLM that summarizes it into a maximum-density snippet, which the main LLM can use for a token-efficient understanding of that subtree during the early planning phase.
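
Something like this for the summarizer tool, as a minimal sketch. The small model is passed in as a plain callable (prompt in, text out) because I'm not assuming any particular client library; `SubtreeSummarizer`, the prompt wording and the `max_chars` cap are all made up for illustration. Caching by content hash is my addition so unchanged subtrees aren't re-summarized:

```python
import hashlib
from typing import Callable, Dict


class SubtreeSummarizer:
    """Optional tool: compress a subtree's text via a smaller model."""

    def __init__(self, small_model: Callable[[str], str], max_chars: int = 400):
        self.small_model = small_model  # hypothetical: any prompt -> text function
        self.max_chars = max_chars      # hard cap on the snippet length
        self._cache: Dict[str, str] = {}

    def summarize(self, subtree_text: str) -> str:
        # Key the cache on the content, so edits to the subtree invalidate it.
        key = hashlib.sha256(subtree_text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            prompt = (
                "Summarize this code as densely as possible "
                f"(at most {self.max_chars} characters):\n\n{subtree_text}"
            )
            # Truncate in case the small model ignores the length instruction.
            self._cache[key] = self.small_model(prompt)[: self.max_chars]
        return self._cache[key]
```

During planning the main model would call this instead of reading the raw subtree, and only fall back to the full text once it decides a subtree actually matters.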

But I'd agree that an LLM built natively around AST is a pretty cool idea.