neutrinobro a day ago

Was this done by manually reviewing commit messages? I think it would be interesting/useful to have a tool that could use some basic heuristics about LLM generated code to detect code-blobs even if they are not explicitly called out in a commit message.

jonathrg a day ago

The diff of the linked commit in git is completely trivial; clearly it just got tagged because of the signoff in the commit message: https://github.com/git/git/commit/d7971544fe17378f44f4998301...

I would be surprised if there is no LLM-assisted code in there prior to this commit; this is just the first where the author chose to disclose it.

wrs a day ago

Apparently, though not very carefully. The "particularly large LLM generated code churn" in the ram library, for example, is the LLM being used to simply git-revert a change that was not originally done by an LLM.

joeyh a day ago

The commit it reverted has a high probability of also being generated with an LLM, though without disclosing that in the commit message.

dijksterhuis a day ago

when i was reading this i thought of writing some quick and dirty cli tool that checks commit co-authors. wouldn't be perfect, but would eliminate a good chunk of low hanging fruit.
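A rough sketch of what such a co-author check could look like, in Python. The marker list and the set of trailer keys are assumptions made up for illustration, not an authoritative registry of how any tool signs its commits, and the function name `ai_trailers` is hypothetical.

```python
import re

# Hypothetical substrings to look for; adjust to whatever tools you care about.
AI_MARKERS = ("claude", "copilot", "chatgpt", "cursor", "gemini")

# Trailer keys to inspect. Only Co-authored-by is widely used; the others are guesses.
TRAILER_RE = re.compile(
    r"^(co-authored-by|assisted-by|generated-by):[ \t]*(.+)$",
    re.IGNORECASE | re.MULTILINE,
)


def ai_trailers(message: str) -> list[str]:
    """Return trailer values in a commit message that mention an AI marker."""
    hits = []
    for _key, value in TRAILER_RE.findall(message):
        if any(marker in value.lower() for marker in AI_MARKERS):
            hits.append(value.strip())
    return hits
```

You could feed it the output of `git log --format=%B`, one commit message at a time. As noted above, this only catches commits where the author chose to disclose the tool.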

api a day ago

Just like with writing, any kind of AI detection is going to be inaccurate to the point of snake oil.

LLM detection in writing is basically today's polygraph test pseudoscience. There was a blog post a while ago where someone fed classic literature into one and it was detected as probably AI.

neutrinobro a day ago

I'm not sure that is the case in this instance. Certainly general writing is a lot more variable and harder to classify, and on the other extreme certain one-line code changes don't have enough information to say anything. However, a blob with a 500+ line code change and 200+ lines of comments is a dead ringer for some of the current class of LLMs. That isn't to say this behavior couldn't be obfuscated, but some basic categorization could probably separate the majority of human authored commits vs. AI commits. Heck, you could probably train an AI to detect commit-style just by using pre-2022 code archives and existing known-to-be-AI edits/commits.
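To make the size heuristic concrete, here is a crude sketch that counts added code and comment lines in a unified diff. The 500/200 defaults just mirror the numbers in this comment and have not been validated against anything; the function names are hypothetical, and the comment detection only recognizes `#` and `//` prefixes (so it will miscount things like C preprocessor lines).

```python
COMMENT_PREFIXES = ("#", "//")  # crude; ignores block comments and docstrings


def added_line_stats(diff_text: str) -> tuple[int, int]:
    """Return (code_lines, comment_lines) among the non-blank added lines of a unified diff."""
    code = comments = 0
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++" is the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        body = line[1:].strip()
        if not body:
            continue
        if body.startswith(COMMENT_PREFIXES):
            comments += 1
        else:
            code += 1
    return code, comments


def looks_bulk_generated(diff_text: str, min_code: int = 500, min_comments: int = 200) -> bool:
    """Flag diffs that add at least min_code code lines and min_comments comment lines."""
    code, comments = added_line_stats(diff_text)
    return code >= min_code and comments >= min_comments
```

This is a filter for picking commits to look at by hand, not a classifier. A large human-written, well-commented change would trip it too.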

zahlman a day ago

The heuristics that would be used to "detect AI" here would be things that shouldn't be happening anyway, so false positives wouldn't matter.

perrygeo a day ago

It's not just "the code itself looks LLM generated" - it's also LOC/hr by a particular author which suggests vibe coding. You could look at the author's github contributions to identify time periods when the author was generating code at super-human speeds. Combine the two signals and you might get something better than a pseudoscience?
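A minimal sketch of the LOC/hr signal, assuming you have already extracted `(author, timestamp, lines_added)` tuples from the history by some other means. The 1000 lines/hour cutoff is an arbitrary placeholder, not a measured human limit, and both function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def hourly_loc(commits):
    """Sum lines added per (author, hour) bucket.

    commits: iterable of (author, datetime, lines_added) tuples.
    """
    totals = defaultdict(int)
    for author, when, lines_added in commits:
        # Truncate the timestamp to the start of its hour.
        hour = when.replace(minute=0, second=0, microsecond=0)
        totals[(author, hour)] += lines_added
    return dict(totals)


def suspicious_hours(commits, max_loc_per_hour=1000):
    """Return sorted (author, hour) buckets whose total exceeds the cutoff."""
    return sorted(
        key for key, total in hourly_loc(commits).items() if total > max_loc_per_hour
    )
```

Commit timestamps record when work was committed, not when it was written, so a single large commit after days of local work will also show up here. That is one reason to combine this with the other signal rather than use it alone.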

verdverm a day ago

An agent doesn't have to be perfect to be useful. If it can find clear examples of stuff you don't want to see in a (potential) dependency quickly, that will save you time. Give it search tools and some policies, then have it go find things. You then check them out, ask followups.

Agents as super-powered (re)search assistants are underrated.