Remix.run Logo
tsunamifury 5 hours ago

Ground truth is not consensus, it has to be graded against what actually works for the original goal. Plenty of scenarios with AI and Humans can result in consensus around incorrectness.

adamtaylor_13 5 hours ago | parent [-]

While pedantically correct, I think the comment above assumed that you've correctly specified the work. If you can't correctly specify your work, AI agents are just going to help you get a non-solution faster.

tsunamifury 5 hours ago | parent [-]

Isn't coding the act of specificying the work to a processor? And AI agents are supposed to bridge the gap with intelligence from less specificed to more specified or possibly even more intelligent and alternate implementations?

wrs 2 hours ago | parent | next [-]

What I meant by "ground truth" is that it is not fuzzy, not AI-evaluated, and not a consensus. The test suite passes or it doesn't. The codebase lints or it doesn't. The performance improved or it didn't.

An agent can help you create the specification, but it's up to you to know whether it's correctly testing that you got the result you wanted.

adamtaylor_13 5 hours ago | parent | prev [-]

Yep. And yet, there's still some level of specification you have to do.