barrkel 2 hours ago

There's another gap, actually. Imprecise specification.

When you need to implement something yourself, you have to make decisions when faced with the reality of turning ideas into code.

An AI agent sometimes surfaces these decisions, and sometimes it just makes a choice.

The risk is that tests just embed these decisions as policy in code, without there having been proper consideration.

Often there's a core ambiguity in a conception somewhere, and because of the limited context of an AI, it can implement things one way and then another for the next feature, without actually hitting the inconsistency.
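
To illustrate, here is a minimal hypothetical sketch (not from any real codebase). Assume a spec that only says "round amounts to the nearest cent" and never says what happens at exactly half a cent. The function names `invoice_total` and `refund_total` and the choice of rounding modes are assumptions made up for this example. Each feature picks a different rule, and each test pins its own choice, so both tests pass and nothing ever flags the inconsistency.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

CENT = Decimal("0.01")


def invoice_total(amount: Decimal) -> Decimal:
    # Feature 1 (hypothetical): the implementer chose round-half-up.
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


def refund_total(amount: Decimal) -> Decimal:
    # Feature 2 (hypothetical), written later with no memory of feature 1:
    # the implementer chose round-half-even (banker's rounding).
    return amount.quantize(CENT, rounding=ROUND_HALF_EVEN)


def test_invoice_total() -> None:
    # Pins the half-up decision as policy.
    assert invoice_total(Decimal("2.665")) == Decimal("2.67")


def test_refund_total() -> None:
    # Pins the half-even decision as policy.
    assert refund_total(Decimal("2.665")) == Decimal("2.66")


if __name__ == "__main__":
    test_invoice_total()
    test_refund_total()
    # Both tests pass, yet the same amount rounds two different ways.
    print(invoice_total(Decimal("2.665")), refund_total(Decimal("2.665")))
```

Neither test is wrong on its own terms. The gap only shows up if someone compares the two features on the same input, which is exactly the check that nobody was prompted to write.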

AndrewKemendo an hour ago

I agree, but this is captured in the higher-order evaluation layer.

A poor specification that is implemented perfectly, in the technical sense, still won't achieve the intended goal, because the user has not correctly specified that goal. Ultimately it will fail at the highest-level task.

But that won't necessarily reveal itself in code. It would reveal itself in the failure of other people, or of the user, to adopt that tool for their workflow.