fluidcruft · 15 hours ago

Yeah, I'm not buying the last bit about lower MSE with one term in the model vs. two (the Brier score with a single outcome category is just the MSE of the predicted probabilities). That's the sort of thing that would make me go dig for where I fucked up the calculation.

kqr · 15 hours ago · parent

With one term the fit is more robust to excluding endpoints when constructing the jackknife (leave-one-out) train/test split, I think. But you're right, it does sound fishy.

fluidcruft · 15 hours ago · parent

What the post is describing is just ANOVA. If removing a category improves the overall fit, then fitting the two terms independently has the same optimal solution (with the two terms found to be identical). MSE never increases when you add a category; that's why model comparisons have to reach for criteria that penalize adding parameters.
kqr · 13 hours ago · parent

No, the post is doing cross-validation to test predictive power directly. The error will not decompose as neatly in that case.
fluidcruft · 8 hours ago · parent

Why would they do that, and where do you see evidence that they did?
kqr · 6 hours ago · parent

Because it's a direct way to measure predictive power, and the post says so: "We’ll use leave-one-out cross-validation"
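The two claims in this thread can be seen side by side in a small simulation. This is a hypothetical sketch (the data, seed, and helper names are mine, not from the post): with ordinary least squares, in-sample MSE can never increase when a column is added to the design matrix, but the leave-one-out cross-validated error that kqr describes carries no such guarantee and can get worse when the extra term is useless.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two groups with identical true means,
# so a group-indicator term is a useless extra parameter.
n = 40
group = np.repeat([0, 1], n // 2)
y = rng.normal(loc=5.0, scale=1.0, size=n)

def fit_predict(X_train, y_train, X_test):
    # Ordinary least squares fit, then predict on X_test.
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return X_test @ beta

def design(g, terms):
    # terms=1: intercept only; terms=2: intercept + group dummy.
    cols = [np.ones_like(g, dtype=float)]
    if terms == 2:
        cols.append(g.astype(float))
    return np.column_stack(cols)

results = {}
for terms in (1, 2):
    X = design(group, terms)
    # In-sample MSE: never increases when a column is added.
    resid = y - fit_predict(X, y, X)
    in_sample = np.mean(resid ** 2)
    # Leave-one-out CV MSE: can increase with the extra column.
    loo_errs = []
    for i in range(n):
        mask = np.arange(n) != i
        pred = fit_predict(X[mask], y[mask], X[i:i + 1])
        loo_errs.append((y[i] - pred[0]) ** 2)
    results[terms] = (in_sample, float(np.mean(loo_errs)))
    print(f"{terms} term(s): in-sample MSE {in_sample:.4f}, "
          f"LOO-CV MSE {results[terms][1]:.4f}")
```

The in-sample comparison always favors (or ties with) the larger model, which is fluidcruft's point about needing penalized criteria; the leave-one-out column is the direct predictive-power measurement kqr is pointing at.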
|
|
|
|