> In a blind evaluation of nearly 3,000 anonymized comparisons, professors rated AI responses significantly higher than answers written by other professors, with AI winning 75% of head-to-head matchups.

A 75% win rate seems pretty good!

Paper link: https://law.stanford.edu/wp-content/uploads/2026/06/salinas_...

causal 6 hours ago | parent | next [-]

I wonder to what degree the AI was just better at communicating. My experience with attorneys is that they are often some of the worst writers.

	applicative 5 hours ago | parent [-]
		The writing is always fluid and grammatically flawless. This carries much more weight with us than we believe. I know the illusion well from decades of grading college papers. Many of the highest-quality students use English as a second language, and I know this, but an American well trained in writing, grammar, and spelling always gives an impression of superiority. (Being well trained in writing, grammar, spelling, etc. is of course a real merit, which is how the illusion forms: it is basically an illusion of global 'intelligence'.)

falcor84 6 hours ago | parent | prev | next [-]

Yeah, a 75% win rate is a ~200-point Elo difference, which is quite massive.
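For anyone who wants to check that arithmetic, here is a minimal sketch. It assumes the standard logistic Elo model with a 400-point scale and treats the 75% figure as an expected score. The paper itself may not use Elo at all, and the function names are made up for illustration.

```python
import math


def elo_gap_from_win_rate(p: float) -> float:
    """Rating gap implied by an expected score p (assumes the standard 400-point logistic Elo model)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


def win_rate_from_elo_gap(gap: float) -> float:
    """Expected score of the higher-rated side for a given rating gap (same model)."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


if __name__ == "__main__":
    gap = elo_gap_from_win_rate(0.75)
    print(round(gap, 1))                          # 190.8
    print(round(win_rate_from_elo_gap(200), 3))   # 0.76
```

Under those assumptions a 75% win rate works out to roughly 191 points, so "~200" is a fair rounding; an exact 200-point gap would correspond to about a 76% win rate.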

jshier 6 hours ago | parent | prev [-]

I do wish they'd used some more objective criteria. Simply being preferable is one of the things LLMs have been trained for since the beginning, hence their sycophantic nature.

wilg 6 hours ago | parent [-]

What criteria would you use for judging legal arguments?

mitkebes 6 hours ago | parent | next [-]

The arguments need to be based on actual law, and any cited reference cases need to be real.

There have been a lot of news stories about lawyers using AI and then getting in trouble for citing hallucinated laws or cases. It doesn't matter if the AI response is "preferred" over the human one if it gets thrown out under the scrutiny of a real case.

wilg 6 hours ago | parent [-]

Who's gonna determine that? A bunch of law professors?

	voxl 5 hours ago | parent [-]
		But did they? Or did they just go off what answer felt better? Did they put in any work to actually confirm the answer? Or did the busy law professors just click through and move on with their life?

mylifeandtimes 6 hours ago | parent | prev [-]

Maybe checking whether the case law it cited was real or imagined? Just one idea, IANAL.

	gamerDude 6 hours ago | parent [-]
		Well, they had data on whether an answer would be harmful to students' learning. 3.5% of AI answers were scored as harmful, versus 12% of law professor answers.