| ▲ | mhitza 7 hours ago | |
3. What, or who, is the judge of correctness (accuracy), regardless of how many solutions run in parallel? If I optimize for maximum accuracy, how close can I get to 100% mathematically, and how much would that cost? | ||
| ▲ | kaicianflone 2 hours ago | parent | next [-] | |
I’m working on an open source project that treats this as a consensus problem instead of a single model accuracy problem. You define a policy (majority, weighted vote, quorum), set the confidence level you want, and run enough independent inferences to reach it. Cost is visible because reliability just becomes a function of compute. The question shifts from “is this output correct?” to “how much certainty do we need, and what are we willing to pay for it?” Still early, but the goal is to make accuracy and cost explicit and tunable. | ||
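The trade-off described here can be sketched numerically. A minimal illustration (function names are mine, not from the project), assuming independent calls and a binary right/wrong outcome: with per-call accuracy p, the probability that a strict majority of n inferences is correct is a binomial tail, so the compute needed to hit a target confidence can be computed up front.

```python
from math import comb

def majority_correct_prob(n: int, p: float) -> float:
    # Probability that a strict majority of n independent calls,
    # each correct with probability p, lands on the correct answer.
    k_min = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

def samples_for_confidence(p: float, target: float, max_n: int = 201) -> int:
    # Smallest odd number of calls reaching the target confidence;
    # cost scales roughly linearly with this number.
    for n in range(1, max_n + 1, 2):
        if majority_correct_prob(n, p) >= target:
            return n
    raise ValueError("target not reachable within max_n calls")
```

With p = 0.8, three calls lift accuracy to about 0.896 and seven calls clear 95%, so each increment of certainty has a visible compute price, which is exactly the "reliability as a function of compute" framing.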
| ▲ | mapace22 4 hours ago | parent | prev [-] | |
Hi there, To be fair, achieving 100% accuracy is something even humans don't do. This isn't about a system simply asking an AI whether something is right or wrong. The "judge" isn't another AI flipping a coin; it's a code validator based on mathematical checks and pre-established rules. For example, if the agent makes a money transfer, the judge queries the database and validates that the amount is exact. This is where we merge AI intelligence with the reliability of traditional, "old school" code. Getting this close to 100% accuracy is already a huge deal. It's like having three people review an invoice instead of one: it makes it much harder for an error to slip through. As for cost, sure, the AI might cost a bit more because of all these extra validations. But if spending one dollar in tokens saves a company from losing five hundred dollars to an accounting error, the system has already paid for itself. It's an investment, not a cost. Plus, this tighter level of control helps prevent not just errors, but also internal fraud and external irregularities. It's a layer of oversight that pays off. Best regards | ||
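The deterministic-judge idea can be sketched in a few lines (a toy example with hypothetical names; a real system would read the recorded amount from the actual ledger): the check is plain code comparing what was requested against what was recorded, using exact decimal arithmetic rather than a model's opinion.

```python
from decimal import Decimal

def judge_transfer(requested: str, recorded: str) -> bool:
    # Deterministic validator: no model involved. Compares the
    # requested and recorded amounts with exact decimal arithmetic,
    # since binary floats are unsafe for money.
    return Decimal(requested) == Decimal(recorded)
```

Because the judge is ordinary code, it either passes or fails reproducibly; the AI's flexibility stays in the agent, while correctness is pinned down by the rule.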