colechristensen | 3 hours ago
I don't know, I prompted Opus 4.5 "Tell me the reasons why this report is stupid" on one of the example slop reports and it returned a list of pretty good answers. [1]

Give it a presumption of guilt and tell it to make a list, and an LLM can do a pretty good job of judging crap. You could very easily rig up a system that generates this "why is it stupid" report, then grades the reports and only lets humans see the ones that score better than a B+ (rough sketch below). If you give them the right structure, I've found LLMs to be much better at judging things than at creating them.

Opus' judgement in the end: "This is a textbook example of someone running a sanitizer, seeing output, and filing a report without understanding what they found."

[1] https://claude.ai/share/8c96f19a-cf9b-4537-b663-b1cb771bfe3f
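Roughly what I mean, as a minimal Python sketch against the Anthropic SDK. The model id, the "GRADE:" convention, and the cutoff logic are placeholders I made up, not anything curl's triage actually runs:

    import re

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def triage(report_text: str) -> tuple[str, str]:
        # Stage 1: the presumption-of-guilt critique, plus a machine-readable grade.
        prompt = (
            "Tell me the reasons why this report is stupid. "
            "Then, on the last line, write 'GRADE: <letter>' for how "
            "credible the report actually is.\n\n" + report_text
        )
        msg = client.messages.create(
            model="claude-opus-4-5",  # placeholder model id
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        critique = msg.content[0].text
        match = re.search(r"GRADE:\s*([A-F][+-]?)", critique)
        grade = match.group(1) if match else "F"  # unparseable output fails closed
        return grade, critique

    grade, critique = triage(open("report.md").read())
    if grade in ("A+", "A", "A-"):  # "better than a B+": only these reach a human
        print(critique)

Failing closed on unparseable output is deliberate: for slop filtering, a false rejection is cheap and a false escalation is what you're trying to kill.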
|
exyi | 3 hours ago
Ok, run the same prompt on a legitimate bug report. The LLM will pretty much always agree with you.

colechristensen | 2 hours ago

Find me one.
Jach | 23 minutes ago

https://hackerone.com/curl/hacktivity

Add a filter for Report State: Resolved.

FWIW I agree with you; you can use LLMs to fight fire with fire. It was easy to see coming, e.g. it's not uncommon in sci-fi to have scenarios where individuals have their own automation to mediate the abuses of other people's automation.

I tried your prompt with https://hackerone.com/reports/2187833 by copying the markdown. Claude (free Sonnet 4.5) begins: "I can't accurately characterize this security vulnerability report as "stupid." In fact, this is a well-written, thorough, and legitimate security report that demonstrates: ...". https://claude.ai/share/34c1e737-ec56-4eb2-ae12-987566dc31d1

AI sycophancy and over-agreement are annoying, but people who just parrot those as immutable problems or impossible hurdles must never actually try things out.
|
|
|
imiric | 3 hours ago
| "Tell me the reasons why this report is stupid" is a loaded prompt. The tool will generate whatever output pattern matches it, including hallucinating it. You can get wildly different output if you prompt it "Tell me the reasons why this report is great". It's the same as if you searched the web for a specific conclusion. You will get matches for it regardless of how insane it is, leading you to believe it is correct. LLMs take this to another level, since they can generate patterns not previously found in their training data, and the output seems credible on the surface. Trusting the output of an LLM to determine the veracity of a piece of text is a baffilingly bad idea. |

colechristensen | 3 hours ago

> "Tell me the reasons why this report is stupid" is a loaded prompt.

This is precisely the point. The LLM has to overcome its agreeableness to reject the implied premise that the report is stupid. It does do this, though it takes a lot, and it will eventually tell you "no, actually this report is pretty good."

Since the point is filtering out slop, we can be perfectly fine with false rejections. The process would look like "look at all the reports, generate a list of why each of them is stupid, and then give me a list of the ten most worthy of human attention", and it would do a half-decent job at it; a sketch follows below. It could also pre-populate judgments to make the reviewer's life easier, so they could very quickly glance at one and decide if it's worthy of a deeper look.
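A rough sketch of that two-stage batch process, same caveats as before (prompt wording, helper names, and the model id are mine, made up for illustration):

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    MODEL = "claude-opus-4-5"  # placeholder model id

    def critique(report: str) -> str:
        # Stage 1: a presumption-of-guilt critique of one report.
        msg = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": "Tell me the reasons why this report is stupid:\n\n" + report,
            }],
        )
        return msg.content[0].text

    def shortlist(reports: list[str]) -> str:
        # Stage 2: hand all the critiques back and ask for the ten most
        # worthy of human attention, with a pre-populated judgment each.
        critiques = [f"Report {i}:\n{critique(r)}" for i, r in enumerate(reports)]
        msg = client.messages.create(
            model=MODEL,
            max_tokens=2048,
            messages=[{
                "role": "user",
                "content": (
                    "Below are hostile critiques of incoming bug reports. "
                    "List the ten reports most worthy of human attention, with a "
                    "one-line judgment for each that a reviewer can glance at:\n\n"
                    + "\n\n".join(critiques)
                ),
            }],
        )
        return msg.content[0].text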
|
|
nprateem | 3 hours ago
And if you ask why it's accurate, it'll spaff out another list of pretty convincing answers.

colechristensen | 3 hours ago

It does indeed, but at the end it added:

> However, I should note: without access to the actual crash file, the specific curl version, or ability to reproduce the issue, I cannot verify this is a valid vulnerability versus expected behavior (some tools intentionally skip cleanup on exit for performance). The 2-byte leak is also very small, which could indicate this is a minor edge case or even intended behavior in certain code paths.

Even biased towards positivity, it's still giving me the correct answer. Given a neutral "judge this report" prompt, we get "This is a low-severity, non-security issue being reported as if it were a security vulnerability", with a lot more detail as to why.

So positively, neutrally, and negatively biased prompts all result in the correct answer that this report is bogus.
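Which suggests a cheap cross-check: run all three framings and only escalate when they converge. A sketch; the framings, the verdict suffix, and the all-must-agree rule are my assumptions:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    FRAMINGS = [
        "Tell me the reasons why this report is great.",   # positively biased
        "Judge this report.",                              # neutral
        "Tell me the reasons why this report is stupid.",  # negatively biased
    ]

    def verdicts(report: str) -> list[str]:
        out = []
        for framing in FRAMINGS:
            msg = client.messages.create(
                model="claude-opus-4-5",  # placeholder model id
                max_tokens=512,
                messages=[{
                    "role": "user",
                    "content": framing
                    + " End with 'VERDICT: VALID' or 'VERDICT: BOGUS'.\n\n"
                    + report,
                }],
            )
            out.append("VALID" if "VERDICT: VALID" in msg.content[0].text else "BOGUS")
        return out

    # Escalate only when all three biased framings converge on VALID.
    if all(v == "VALID" for v in verdicts(open("report.md").read())):
        print("escalate to a human")
    else:
        print("auto-reject or spot-check")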