Remix.run Logo
ethernot 5 hours ago

I'm not sure that it's possible to produce anything reasonable in that space. It would need to know how far it is away from correct to provide a usable confidence value otherwise it'd just be hallucinating a number in the same way as the result.

An analogy. Take a former commuter friend of mine, Mr Skol (named after his favourite breakfast drink). Seen on a minibus I had to get to work years ago, we shared many interesting conversations. Now he was a confident expert on everything. If asked to rate his confidence in a subject it would be a good 95% at least. However he spoke absolute garbage because his brain was rotten away from drinking Skol for breakfast, and the odd crack chaser. I suspect his model was still better than GPT-4o. But an average person could determine the veracity of his arguments.

Thus confidence should be externally rated as an entity with knowledge cannot necessarily rate itself for it has bias. Which then brings in the question of how do you do that. Well you'd have to do the research you were going to do anyway and compare. So now you've used the AI and done the research which you would have had to do if the AI didn't exist. So the AI at this point becomes a cost over benefit if you need something with any level of confidence and accuracy.

Thus the value is zero unless you need crap information, which is at least here, never, unless I'm generating a picture of a goat driving a train or something. And I'm not sure that has any commercial value. But it's fun at least.

readyplayernull 4 hours ago | parent [-]

Do androids dream of Dunning-Kruger?