Remix.run Logo
Sem: New primitive for code understanding – not LSPs, but entities on top of Git(ataraxy-labs.github.io)
26 points by rohanucla 3 hours ago | 8 comments
andai 7 minutes ago | parent | next [-]

  $ sem impact authenticateUser

  ⊕ function authenticateUser (src/auth/login.ts:26)

    → depends on:    db.findUser, rateLimiter.check
    ← used by:       loginRoute, authMiddleware
    ! 42 entities transitively affected
    ᛋ 7 tests affected
Okay that is pretty cool. I appreciate this information as a human also.

I got about halfway through reinventing something like this last year (minus the git part). I was trying to make a graph of dependencies in the codebase. (I actually got pretty far with a regex!)

hankbond an hour ago | parent | prev | next [-]

I am interested in subtle ways in which we can change how we write software to get better outcomes out of harnesses (model + tools + skills). I'm imagining that use of Sem will be more effective on code written in some shapes than others.

Can you describe what ways this might be beyond just breaking up code into smaller functions?

An example of this is that Models tend to create unit tests that are mostly just mock + reimplementations of imperative code in the functions they test. If you could force behavioral testing by only allowing test creation agents to accessing the function docstring, name/args/types, branch statements and log events, you could potentially avoid these classes of weak tests being created. But that would mean that your code has to optimize to providing signal via those elements.

This is just an example I'm not sure that would actually work.

hankbond 40 minutes ago | parent [-]

Also I keep seeing solutions in this space that are doing inheritance and call stack dependency linkage, but I haven't seen the same level of exploration into data lifetime dependency. Not lifetime in the way it exists in Rust (to my knowledge), but like including when you copy data and transform that copy. The motivation is "if I change this variable, enumerate all the areas that change would propagate to". The idea is similar, to evaluate blast radius of modifications. Ideally something like this could make refactoring more token efficient and consistent as well.

I don't know if you can reliably do that with static analysis tho. I would be interested in some sort of debug attachment like process that does a code coverage type evaluation. If you can't tell this is at least on the edge of (if not past) my depth of expertise

awoimbee 29 minutes ago | parent | prev | next [-]

The benchmarks aren't great, they're super specific to sem's output: why would I ask Claude how many "entities" were modified by a commit and do I need a tool specifically for this request ? Note that an "entity" is a sem-specific concept...

rohanucla 26 minutes ago | parent [-]

Thanks for pointing it out. I agree with you here, my testing process was quite specific to sem's output but also would love any suggestion from you of how you would design the whole testing process for this kind of tool?

I can also give my thought process, because I was more interested in figuring out the model's inherent search results and understanding without sem.

qudat 43 minutes ago | parent | prev | next [-]

I really like this idea and have been experimenting with it over a week or so.

I think there’s an opportunity to use an AST diff system for code forges where you don’t present the user with line diffs in the UI — or at least not as the first diff the user sees.

I firmly believe code review should happen in your editor.

Animats an hour ago | parent | prev [-]

Is this for checking what Claude Code just did to your repo?

rohanucla an hour ago | parent [-]

It can do that, but that's a small slice of what it does. sem parses your codebase into entities (functions, classes, methods) and builds a dependency graph across files.

So instead of line level analysis the whole granularity of seeing changes and tracking thing shifts to entities. It helps in attention mapping of your agent and lets you track the changes faster.

LSPs have been doing it for quite long but using treesitters is faster even tho type awareness is not great with this approach but overall working across multiple languages with a single tool can be quite helpful.