rs545837 5 hours ago

We've been building an open source tool called sem (https://github.com/ataraxy-labs/sem) that takes this one level further: entity-level diffs instead of AST-level.

Instead of showing you which syntax nodes changed, it shows you which functions, classes, and methods changed, classifies the change (text-only, syntax, functional), and walks a dependency graph to tell you the blast radius.
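To make "blast radius" concrete, here is a rough sketch of the general idea, not sem's actual code: given a reverse-dependency map, a breadth-first walk collects everything that transitively depends on a changed entity. The `blast_radius` function, the `dependents` map, and the entity names are all hypothetical, made up for illustration.

```python
from collections import deque


def blast_radius(dependents, changed):
    """Return every entity that transitively depends on a changed entity."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        # Visit each direct dependent once, then queue it for its own dependents.
        for dep in dependents.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    # Report only the affected entities, not the changed ones themselves.
    return seen - set(changed)


# Hypothetical reverse-dependency map: entity -> entities that call or import it.
dependents = {
    "parse_config": ["load_app", "run_tests"],
    "load_app": ["main"],
    "format_date": ["render_report"],
}

print(sorted(blast_radius(dependents, ["parse_config"])))
# → ['load_app', 'main', 'run_tests']
```

Nothing that depends only on `format_date` shows up, which is the point: the walk narrows review to what the change can actually reach.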

The delta + difftastic integration problem in that issue is interesting because sem already has the pieces both sides need: before/after content with full context for every changed entity, plus structured JSON output. The blocker in #535 is that difftastic's JSON doesn't include surrounding context. sem's output includes complete entity bodies by default.

Would love to collaborate on a common interchange format if anyone from the delta or difftastic projects is interested. Entity-level granularity sits naturally above AST-level diffs and below file-level diffs, and having a standard way to represent "what changed and what depends on it" would be useful for the whole ecosystem.

esafak 5 hours ago | parent | next [-]

It tells you the function changed but not how; you still need line-level diffs.

rs545837 4 hours ago | parent [-]

Right, sem gives you both. sem diff --verbose shows the full before/after body of each changed entity. The entity-level view tells you what changed and what's affected. The line-level detail is still there when you need it.

dominotw 5 hours ago | parent | prev [-]

Can diffs be piped through an LLM to give you something higher-level that still ties back to the changes?

rs545837 4 hours ago | parent [-]

You can, but it's slow and expensive, and it hallucinates. An LLM looking at a raw diff might miss a renamed function or invent a dependency that doesn't exist. sem does it structurally: it parses both sides with tree-sitter, computes structural hashes, and walks the real dependency graph. If you want to layer an LLM on top for summarization, you're feeding it 10 entities instead of 500 lines of unified diff.
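For a feel of what a structural hash buys you, here is a minimal sketch. It is not how sem is implemented: it swaps in Python's built-in `ast` module for tree-sitter, and the names `structural_hash` and `classify` are hypothetical. It also collapses the classification to two buckets, "text-only" and "structural", which is a simplification.

```python
import ast
import hashlib


def structural_hash(source):
    """Hash the parsed tree, so comments and whitespace don't affect the result."""
    tree = ast.parse(source)
    # ast.dump omits line/column attributes by default, and comments never
    # reach the tree, so only the code's structure is hashed.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()


def classify(before, after):
    """Classify a change to one entity (assumed labels, not sem's own)."""
    if before == after:
        return "unchanged"
    if structural_hash(before) == structural_hash(after):
        return "text-only"
    return "structural"


before = "def area(w, h):\n    return w * h\n"
reformatted = "def area(w, h):\n    # width times height\n    return w*h\n"
changed = "def area(w, h):\n    return w + h\n"

print(classify(before, reformatted))  # → text-only
print(classify(before, changed))      # → structural
```

A comment or reformat leaves the hash alone, while swapping `*` for `+` changes it, so only the second kind of change needs a closer look.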