> I've had Claude Code write an entire unit/integration test suite in a few hours (300+ tests) for a fairly complex internal tool. This would take me, or many developers I know and respect, days to write by hand.

I'm not sure about this. The tests I've gotten out in a few hours are the kind I'd approve if another dev sent them, but they haven't really ended up finding meaningful issues.

martinald 14 hours ago | parent | next [-]

Just to be clear, they weren't stupid 'is 1+1=2' type tests.

I had the agent scan the UX of the app being built, find all the common flows and save them to a markdown file.

I then asked the agent to find edge cases for them and come up with tests for those scenarios. I then set off parallel subagents to develop the test suite.

It found some really interesting edge cases running them - so even if they never failed again there is value there.

I do realise in hindsight that my original comment makes it sound like the tests were just a load of nonsense. I was blown away with how well Claude Code + Opus 4.5 + 6 parallel subagents handled this.

kace91 14 hours ago | parent | prev | next [-]

Have you noticed how it's never "I got this awesome code!"? It's always "I got good code, trust me".

People say their prompts are good, awesome code is being generated, it solved a month's worth of work in a minute. Nobody comes with receipts.

dboreham 14 hours ago | parent [-]

I keep seeing posts like this so I decided to video record all my LLM coding sessions and post them on YouTube. Early days, I only had the idea on Saturday.

Aeolun 14 hours ago | parent | prev [-]

I find I get better tests if I use agents to generate tests.