| ▲ | nwallin a day ago |
"Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can’t break."--Bruce Schneier There's a corollary here with LLMs, but I'm not pithy enough to phrase it well. Anyone can create something using LLMs that they, themselves, aren't skilled enough to spot the LLMs' hallucinations. Or something. LLMs are incredibly good at exploiting peoples' confirmation biases. If it "thinks" it knows what you believe/want, it will tell you what you believe/want. There does not exist a way to interface with LLMs that will not ultimately end in the LLM telling you exactly what you want to hear. Using an LLM in your process necessarily results in being told that you're right, even when you're wrong. Using an LLM necessarily results in it reinforcing all of your prior beliefs, regardless of whether those prior beliefs are correct. To an LLM, all hypotheses are true, it's just a matter of hallucinating enough evidence to satisfy the users' skepticism. I do not believe there exists a way to safely use LLMs in scientific processes. Period. If my belief is true, and ChatGPT has told me it's true, then yes, AI, the tool, is the problem, not the human using the tool. | ||
| ▲ | czl 10 hours ago |
> I do not believe there exists a way to safely use LLMs in scientific processes. What about giving the LLM a narrowly scoped role as a hostile reviewer, while your job is to strengthen the write-up to address any valid objections it raises, plus any hallucinations or confusions it introduces? That’s similar to fuzz testing software to see what breaks or where the reasoning crashes. Used this way, the model isn’t a source of truth or a decision-maker. It’s a stress test for your argument and your clarity. Obviously it shouldn’t be the only check you do, but it can still be a useful tool in the broader validation process. | ||