rustyhancock a day ago

An important point, though we generally assume there are few bad actors.

But, as the XZ attack showed, we have to assume that advanced persistent threats are a reality for FOSS too.

I can envisage a Sybil attack where several seemingly disparate contributors are actually one actor building a backdoor.

Right now we have a disparity: many contributors can use LLMs, but the receiving projects aren't able to review those submissions as effectively with LLMs.

LLM-generated content often (perhaps by definition) seems acceptable to LLMs. This is the critical issue.

If we had a means of assessing PRs objectively and effectively, that would make this moot.

I wonder if this is a whole new class of issue. Is judging a PR harder than making one? It seems so right now.

nokcha 2 hours ago | parent | next [-]

> LLM generated content often (perhaps by definition) seems acceptable to LLMs.

In my experience (albeit with non-coding questions), ChatGPT 5.2 is often quite eager to critique snippets of its own replies from previous conversations. And reasoning models can definitely find flaws in LLM-written code.

vladms a day ago | parent | prev [-]

> Is judging a PR harder than making one?

Depends on the assumptions. If you assume good intent from the submitter and you spend time explaining what they should improve, why something is not good, etc., then it's a lot of effort. If you assume bad intent, you can just reject with something like "too large a review from an unproven user, please contribute something smaller first".

Yes, we might need to take things a bit slower and build relationships with the people we collaborate with in order to establish some trust (this can also be attacked, but that was already possible).

PowerfulWizard a day ago | parent [-]

On judging vs. making: someone also has to take time away from development to do code review. If the code being reviewed is written by someone who is involved and interested, then at least there's a benefit in training and consensus-building from discussing the code and the project during review. The time and energy of developers qualified to review is quite possibly the bottleneck on development speed too, so wasting review time slows development down.

For AI-generated code, if previous PRs aren't loaded into context, there's no lasting benefit from the time taken to review; it's a blank slate each time. I think ultimately it can be solved with workflow changes (i.e. AI-written code should be attributed to the AI in VCS, the full trace and manual edits should be visible for review, and all human input prompts to the AI should be browsable during review without having to scroll through 10k lines of AI reasoning).