pron 2 hours ago

> I would think the cost multiplier in those cases is much lower for an LLM as compared to a human who doesn't have an inherent understanding and needs to give it thought. Wouldn't you?

No. I don't see why proving would require less relative effort for an LLM. In fact, years ago, long before LLMs, I wrote about why it is relatively easy to write sort-of-correct software yet hard to write provably correct software, and I don't see why it's any different for LLMs. Their power lies in inductive "intuition", while deduction requires effort, just as it does for humans: https://pron.github.io/posts/people-dont-write-programs

But there's no need to speculate. Those who think verification-by-LLM is feasible and cost-effective on an industrial scale are welcome to try it and report what they find. So far I've seen only tiny examples, and even they don't show effortless (i.e. token-light) work by the agent.