sheepscreek 2 days ago
This neatly sums up my experience with Claude Code. It’s a brilliant tool, but one that often needs a tight leash. The challenge is that you won’t know when or where until you’ve used it extensively for your specific use case. In the author’s case, producing a formal proof that inspires little confidence feels counterproductive. It’s very likely that we’re missing a key piece here: perhaps what’s needed is a model trained specifically to keep large language models on task, verify their output, and challenge them on our behalf. That said, deep domain knowledge remains essential. Without it, things can easily get built in odd or unintended ways. A practical workaround (for typical app development, not formal verification) may be to treat the system as a black box, relying on carefully written specifications or test cases. In many situations, that approach could be more effective than trying to wrestle certainty out of inherently uncertain processes.
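To make the black-box idea concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: `slugify` stands in for whatever code the model produced, and `SPEC` and `check_against_spec` are hypothetical names, not part of any tool. The point is only that the expected behavior is written down by a human, independently of the implementation, and the generated code is judged purely on inputs and outputs.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical stand-in for model-generated code; treated as a black box below."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


# The specification: human-written input/output pairs that do not
# depend on how the function is implemented.
SPEC = [
    ("Hello, World!", "hello-world"),
    ("  Leading and trailing  ", "leading-and-trailing"),
    ("already-a-slug", "already-a-slug"),
    ("", ""),
]


def check_against_spec(fn, spec):
    """Return (input, expected, actual) for every case where fn deviates from the spec."""
    failures = []
    for given, expected in spec:
        actual = fn(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


if __name__ == "__main__":
    for given, expected, actual in check_against_spec(slugify, SPEC):
        print(f"FAIL: {given!r} -> {actual!r}, expected {expected!r}")
```

If the model rewrites `slugify` in some odd way, the check doesn't care, as long as the spec still holds. The obvious limit is that this only gives confidence about the cases you thought to write down, which is where the domain knowledge comes back in.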