Remix.run Logo
latenightcoding 3 hours ago

I never hijack non-ai threads to talk about AI, but can anybody share their experience using LLMs to code in Coq, Lean, etc.

lambdas 2 hours ago | parent [-]

I’ve never used them first hand, but crackpots sure do love claiming to solve Riemann hypothesis, P vs NP, Collatz conjecture etc and then peddle out some huge slop. My experience has solely been curiously following what the LLM’s have been generating.

You have to be very, VERY careful. With how predisposed they are to helping, they’ll turn to “dishonesty” rather than just shut down and refuse. What I tend to see is they get backed into a corner, and they’ll do something like prove something different under the guise of another:

They’ll create long pattern matching chains as to create labyrinths of state machines.

They’ll keep naming functions, values and comments to seem plausible, but you have to follow these to make sure they are what they say. A sneaky little trick is to drop important parameters in functions, they appear in the call but not in the actual body.

They’ll do something like taking a Complex value, but only working with the real projection, rounding a number, creatively making negatives not appear by abs etc etc

So even when it compiles, you’ve got the burden of verifying everything is above board which is a pretty huge task.

And when it doesn’t work, introducing an error or two in formal proof systems often means you’re getting exponentially further away from solving your problem.

I’ve not seen a convincing use that tactics or goals in the proof assistant themselves don’t already provide