lich_king 4 hours ago:
I'm always startled by how HN approaches these topics. When we have a press release from a university about how researchers can detect thoughts via fMRI, we have no issue with the claim. But if a vendor makes a pretty believable claim that there are repetitive statistical patterns in LLM output, it's all of a sudden treated the same as palm reading. The problem isn't that AI detection doesn't work; the state of the art in this field is pretty solid. The only issue is that it's probabilistic, so it sometimes fails, and when it does, we have nothing else to fall back on in situations where you actually want to know whether someone put in the work.

So what are you proposing, exactly? That we run a large-scale experiment of "let's see what happens if children don't actually need to learn to do thinking and writing on their own"? The reality is that without some form of compulsion, most kids would rather play video games or scroll through TikTok all day. Or that we move to a vastly more resource-intensive model where every kid is given personalized instruction and watched 1:1?
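(A toy sketch of the kind of surface-level statistical signals such detectors are said to lean on, like uniform sentence lengths and low lexical variety. This is purely illustrative and not any vendor's actual method; the sample text and the choice of features are made up.)

```python
# Illustrative only: score a text on two crude statistical signals often
# associated with LLM output -- low "burstiness" (very uniform sentence
# lengths) and low lexical variety (type-token ratio). Real detectors are
# far more sophisticated; this just shows the general idea.
import re
import statistics

def detection_signals(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    words = [w.lower() for w in re.findall(r"[a-zA-Z']+", text)]
    burstiness = statistics.pstdev(lengths) if len(lengths) > 1 else 0.0
    ttr = len(set(words)) / len(words) if words else 0.0
    return {"sentence_length_stdev": burstiness, "type_token_ratio": ttr}

sample = ("Furthermore, it is important to note that the topic is complex. "
          "Moreover, it is essential to consider every perspective carefully. "
          "In conclusion, the issue clearly merits further discussion.")
print(detection_signals(sample))
```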
Zigurd 4 hours ago:
>> But if a vendor makes a pretty believable claim that there are repetitive statistical patterns in LLM output, it's all of a sudden treated the same as palm reading.

Making pretty believable claims is exactly what fortunetellers do. The problem isn't guessing correctly about AI content in writing. The problem is false positives. That's what puts it in the same category as predictive policing scam software. And fortunetelling.
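(To put numbers on the false-positive worry, a minimal back-of-the-envelope sketch; the sensitivity, specificity, and AI-usage rate below are hypothetical, not measurements of any real detector.)

```python
# Base-rate arithmetic behind the false-positive concern: even a detector
# with 95% sensitivity and 98% specificity (hypothetical numbers) flags a
# meaningful number of honest students when run over a whole cohort.
def accusation_breakdown(n_students, ai_use_rate, sensitivity, specificity):
    cheaters = n_students * ai_use_rate
    honest = n_students - cheaters
    true_positives = cheaters * sensitivity
    false_positives = honest * (1 - specificity)
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return false_positives, precision

fp, precision = accusation_breakdown(
    n_students=1000, ai_use_rate=0.10, sensitivity=0.95, specificity=0.98)
print(f"Honest students flagged: {fp:.0f}")                          # ~18 per 1000
print(f"Chance a flagged student actually used AI: {precision:.0%}")  # ~84%
```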
PufPufPuf 3 hours ago:
Eliminating any statistically significant difference between high-quality human-written text and LLM-written text is exactly what LLMs are being trained for. At this point, "the text is low quality, therefore it must be human" is a much stronger signal.
wongarsu 3 hours ago:
You can detect whether texts from a year ago used AI based on statistical patterns. Nobody is taking issue with that. But once you tell people "we will run these tests to detect if your future submissions are using AI," you create an adversarial environment, and your statistical methods will continuously break. Not because statistics is broken, but because you are trying to hit a moving target that doesn't want to be hit. That's not like detecting thoughts via fMRI; it's like detecting tomorrow's malware with yesterday's malware signatures. Or like researchers making a vaccine against the common cold.

And the obvious proposal to fix that has been made multiple times in this thread: don't make take-home tasks part of the grade. Instead of trying to punish what you can't reliably detect, take away the incentive to do it in the first place.
armchairhacker 3 hours ago:
> When we have a press release from a university about how researchers can detect thoughts via fMRI, we have no issue with the claim.

Different people. I, for one, have always claimed that fMRI is too coarse-grained for detailed thought detection.

If AI detection "sometimes fails," it doesn't "work." It works well enough to help convict someone alongside other evidence, but when there is no other evidence, nor any attempt to get some, it has no good use.

What I propose is simple: grade only closed-book exams, and hold students' phones during the exams. Students don't need 1:1 monitoring; it's the same as 10-20 years ago.