xmddmx 7 days ago
The concept you need here is "Statistical Power". The ELI5 version is that there are two mistakes you can make when looking at a P value:

Type I error, where your P value is falsely low. In the experiment being discussed here, it would lead one to conclude that AI code is worse. Otherwise known as a false positive.

Type II error, where your P value is falsely high, leading you to conclude that AI code is no different. Otherwise known as a false negative.

https://en.wikipedia.org/wiki/Power_(statistics)

One can calculate statistical power for a given experimental protocol. My hunch is that if you did this, you would find this experiment is grossly under-powered. This means you can't make the "absence of evidence" claim.
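To make the power calculation concrete, here is a rough sketch using a normal approximation for a two-sided, two-sample comparison. The effect size (0.3 standard deviations) and the sample sizes are hypothetical placeholders, not numbers from the experiment under discussion, and `approx_power` is just an illustrative helper name.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized difference between group means
    (Cohen's d); n_per_group is the sample size in each group.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected shift of the test statistic when the effect is real.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


if __name__ == "__main__":
    # Hypothetical effect size and sample sizes, for illustration only.
    for n in (10, 50, 200):
        print(n, round(approx_power(0.3, n), 2))
```

Under these assumptions, power climbs steeply with sample size, so a small study can easily miss a modest real effect (a Type II error). For small samples a t-test-based power calculation would be more accurate than this normal approximation.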
davrosthedalek 7 days ago | parent
He can't make the "evidence of absence" claim, but he can absolutely make the "absence of evidence" claim.