logancbrown 2 hours ago

Would this realistically be a problem for code going through LLM-based code review? Presumably, if an LLM reviewer agent hits this commentary, it would fail to analyze it and exit, thus failing the automated code review and forcing a human to read through the code, who would subsequently catch and revoke it.

dwa3592 2 hours ago | parent | next [-]

or if they are a lazy human - they'd think this model is too strict, let's just review with haiku so that i can tell my manager "it's done". haiku might catch things or not.

i'd say it's an okay attempt from the malware creator's side. but it can be caught easily with a prompt change.

ofjcihen 2 hours ago | parent | prev | next [-]

In a well-architected design, yeah.

Then again those feel rare from where I sit on the security side.

dyauspitr 43 minutes ago | parent | prev [-]

Wouldn’t it just complete the code review, having silently fallen back to opus 4.8, thus letting through cleverly written malicious code that fable would have caught but opus wouldn’t?