I wouldn't call it entirely defeated; it got maybe 90% of the way there. Before LLMs, you couldn't get even 50% of the way there in an automated way.

> What he produces

I feel like personifying LLMs beyond what they currently are is a mistake people make (though humans always do this). They're not entities; they don't know anything. If you treat them as too human, you might eventually fool yourself a little too much.

thecr0w a day ago | parent | next [-]

As a couple of other comments pointed out, it's also not fair to judge Claude based on a one-shot like this. I sort of assume these limitations would remain even if we went back and forth, but to be fair, I didn't try that more than a few times in this investigation. Maybe on try three it totally nails it.

	micromacrofoot 8 hours ago | parent [-]
		Very true. I would also apply this caution to test projects given to real people in the hiring process. Comparing one-shots from actual people is unfair too, and often the most valid assessment comes from giving them feedback and seeing how they respond to it. Aside from that point: if you are reading this and making people do a project as part of the hiring process, you should absolutely be paying them for their time (even a token amount).
