Remix.run Logo
tsunamifury 21 hours ago

Ground truth is not consensus, it has to be graded against what actually works for the original goal. Plenty of scenarios with AI and Humans can result in consensus around incorrectness.

adamtaylor_13 21 hours ago | parent [-]

While pedantically correct, I think the comment above assumed that you've correctly specified the work. If you can't correctly specify your work, AI agents are just going to help you get a non-solution faster.

tsunamifury 21 hours ago | parent [-]

Isn't coding the act of specificying the work to a processor? And AI agents are supposed to bridge the gap with intelligence from less specificed to more specified or possibly even more intelligent and alternate implementations?

wrs 17 hours ago | parent | next [-]

What I meant by "ground truth" is that it is not fuzzy, not AI-evaluated, and not a consensus. The test suite passes or it doesn't. The codebase lints or it doesn't. The performance improved or it didn't.

An agent can help you create the specification, but it's up to you to know whether it's correctly testing that you got the result you wanted.

adamtaylor_13 20 hours ago | parent | prev [-]

Yep. And yet, there's still some level of specification you have to do.