I've been working on a luks+btrfs+systemd tool (for managing an encrypted raid1 pool). While I have worked with each individually, it's not obvious what kind of cases you have to handle when composing them together. A lot of it is simply emergent, and the status quo has been to do your best and then see what actually happens at runtime.

Documentation is helpful for describing high-level intentions, but the beauty is when you have access to source code. Now a good model can derive behavior from the implementation instead of from docs, which are inherently limited.

I implemented the luks+btrfs part by hand a few years ago, and I resurrected the project a couple of months ago. Using source code for local reference, Claude discovered so many major cases I had missed, especially in the unhappy-path scenarios, even in my own hand-written tests. And it helped me set up an amazing NixOS VM test system, including reproduction tests on the libraries to see what they do in weird undocumented cases.

So I think "tasks beyond our intellect (and/or time and energy)" can be fitting. Otherwise I'd only be capable of polishing this project if luks+btrfs+systemd were specifically my day job. I just can't fit so much in my head and working memory.

zekica 7 hours ago

And it can fail in great ways. Latest example: I asked Claude for a non-trivial backup and recovery script using restic. I gave it the whole restic repo and it still made up parameters that don't exist in the code (but do exist in a pull request that has been sitting unmerged for 10+ months).

	hombre_fatal 7 hours ago
		Interesting. I don't think I've seen hallucinations at that level when it's referencing source code. Though my workflow always starts in plan mode where Claude is clearly more thorough (which is the reason it takes 10x as long as going straight to impl). I rarely skip it.