nehal3m 3 days ago

In your analogy, that calculator would produce a correct answer only 80% of the time, and plausible-looking but incorrect ones the other 20%.

If that were the case I’d hire pen guy.

adriancooney 3 days ago

What's the error rate of the pen guy?

Also, if your AI has a 20% error rate, you're not holding it right. You need to spend more time keeping it on rails - unit tests, integration tests, e2e tests, local dev + browser use, preview deployments, staging environments, phased rollouts, AI PR reviews, rolling releases. The error rate will be much closer to 0%.

davebren 3 days ago

How does a phased rollout improve LLM error rates exactly?

adriancooney 3 days ago

Error rate here is the rate of shipping bugs to customers.

davebren 3 days ago

That wasn't what the comment you responded to was referring to. I guess it makes sense since you are kind of like an LLM with how you respond to input.

braebo 3 days ago

More like “producing 80% of the correct answer”, with the remaining 20% coming from some nudging and tweaking. Still extremely valuable.
