▲ | datastoat 6 hours ago | |
Author: "5% chance of shipping something that only looked good by chance". One philosophy of statistics says that the product either is better or isn't better, and that it's meaningless to attach a probability to facts, which the author seems to be doing with the phrase "5% chance of shipping something".

Parent: "5% chance of looking as good as it did, if it were truly no better than the alternative." This accepts the premise that the product quality is a fact, and only uses probability to describe the (noisy / probabilistic) measurements, i.e. "5% chance of looking as good".

Parent is right to pick up on this, if we're talking about a single product (or, in medicine, if we're talking about a single study evaluating a new treatment). But if we're talking about a workflow for evaluating many products, and we're prepared to consider a probability model that says some products are better than the alternative and others aren't, then the author's version is reasonable.
▲ | pkhuong 5 hours ago | parent | next [-] | |
One easy slip-up when discussing p values in the context of a workflow or a decision-making process is that a process with p < 0.05 doesn't give us any bound on the ratio of actually good to merely lucky changes among those that pass. If every change we test is good, the fraction of false positives among the changes we ship is 0%; if every change we test is bad, that fraction is 100%. Hypothesis testing is no replacement for insight or taste.
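To put rough numbers on this, here is a minimal sketch of how the share of "lucky" changes among shipped ones moves with the mix of candidates. The function name `lucky_share` and the parameters `base_rate` (fraction of candidates that are truly better) and `power` (chance a truly better change passes the test) are made up for illustration; it also assumes every change that passes at p < 0.05 gets shipped and that a no-better change passes exactly 5% of the time.

```python
def lucky_share(base_rate, power, alpha=0.05):
    """Fraction of shipped changes that are not actually better.

    base_rate: assumed fraction of candidate changes that are truly better.
    power:     assumed probability that a truly better change passes the test.
    alpha:     probability that a no-better change passes anyway.
    """
    false_pos = alpha * (1.0 - base_rate)  # no-better changes that pass
    true_pos = power * base_rate           # truly better changes that pass
    shipped = false_pos + true_pos
    if shipped == 0:
        return float("nan")                # nothing ships, share is undefined
    return false_pos / shipped


# Same test, same threshold, very different outcomes depending on the mix.
for base_rate in (0.0, 0.1, 0.5, 0.9, 1.0):
    share = lucky_share(base_rate, power=0.8)
    print(f"base rate {base_rate:.1f}: {share:.1%} of shipped changes are lucky")
```

With these assumed inputs the share runs from 100% (no good candidates) down to 0% (only good candidates), and it is about 36% when one candidate in ten is truly better, even though the threshold is 0.05 throughout.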
▲ | kgwgk 4 hours ago | parent | prev | next [-] | |
> But if we're talking about a workflow for evaluating many products, and we're prepared to consider a probability model that says some products are better than the alternative and others aren't, then the author's version is reasonable.

It’s not reasonable unless there is a real difference between those “many products” which is large enough to be sure that it would rarely be missed. That’s quite a strong assumption.
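One way to see how strong that assumption is: a rough sketch that asks how often a real difference must be detected (the power) for at most a given share of shipped changes to be lucky. The name `min_power_needed` and its parameters are hypothetical, and the model is simplified: a fraction `base_rate` of candidates is truly better, all of them are detected with the same power, and no-better candidates pass with probability `alpha`.

```python
def min_power_needed(base_rate, target_share, alpha=0.05):
    """Smallest power for which at most target_share of shipped changes are lucky.

    Rearranges  alpha*(1-b) / (alpha*(1-b) + power*b) <= t  for power.
    Requires 0 < target_share <= 1. A result above 1 means no test at this
    alpha can reach the target for this mix of candidates.
    """
    if base_rate <= 0:
        return float("inf")  # no truly better candidates: target unreachable
    return alpha * (1.0 - base_rate) * (1.0 - target_share) / (target_share * base_rate)


# Even with half the candidates truly better, "5% lucky" needs 95% power.
print(min_power_needed(base_rate=0.5, target_share=0.05))
# With one in ten truly better, the required "power" exceeds 1, i.e. impossible.
print(min_power_needed(base_rate=0.1, target_share=0.05))
```

Under these assumptions, reading p < 0.05 as "5% of what we ship is lucky" only works when real differences are almost never missed and a large share of candidates is truly better.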