kqr 15 hours ago

With a single term it is more robust to excluding endpoints when constructing the jackknife train/test split, I think. But you're right, it does sound fishy.

fluidcruft 15 hours ago | parent [-]

What the post is describing is just ANOVA. If removing a category improved the overall fit, then fitting the two terms independently would reach the same optimum (with the two independent terms found to be identical). In-sample MSE never increases when you add a category.
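To illustrate the claim (a minimal sketch with made-up data and an arbitrary category split, not the post's model): adding an indicator column to a least-squares design can only lower, never raise, the in-sample MSE, because the fit can always set the new coefficient to zero.

```python
import numpy as np

# Hypothetical data for illustration only.
rng = np.random.default_rng(0)
n = 50
y = rng.normal(size=n)

def train_mse(X, y):
    # Ordinary least squares fit, then in-sample mean squared error.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ beta) ** 2)

X_base = np.ones((n, 1))                     # intercept only
cat = (np.arange(n) % 2 == 0).astype(float)  # arbitrary category indicator
X_cat = np.column_stack([np.ones(n), cat])   # intercept + category

# The richer design never fits worse in-sample.
assert train_mse(X_cat, y) <= train_mse(X_base, y) + 1e-12
```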

This is why you have to reach for things that penalize adding parameters when running model comparisons.
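For instance, a Gaussian AIC (one common such penalty; I'm not claiming it's what the post used) trades goodness of fit against parameter count:

```python
import numpy as np

def aic_gaussian(rss, n, k):
    # AIC for a Gaussian model, up to an additive constant:
    # n * log(RSS / n) + 2 * k, where k counts fitted parameters.
    return n * np.log(rss / n) + 2 * k

# With identical fit (same RSS), the model with more parameters scores
# worse (higher AIC): an extra parameter must reduce RSS enough to pay
# for its penalty of 2.
assert aic_gaussian(1.0, 100, 3) > aic_gaussian(1.0, 100, 2)
```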

kqr 13 hours ago | parent [-]

No, the post is doing cross-validation to test predictive power directly. The error will not decompose as neatly then.

fluidcruft 8 hours ago | parent [-]

Why would they do that, and where do you see evidence that they did?

kqr 6 hours ago | parent [-]

Because it's a direct way to measure predictive power, and it says so: "We’ll use leave-one-out cross-validation"
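For reference, leave-one-out CV refits the model n times, each time predicting the single held-out point; a generic least-squares sketch (not the post's actual code, toy data made up here):

```python
import numpy as np

def loo_mse(X, y):
    # Leave-one-out CV: refit on n-1 points, predict the held-out point,
    # and average the squared prediction errors.
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        errs.append(float((y[i] - X[i] @ beta) ** 2))
    return np.mean(errs)

x = np.arange(5.0)
X = np.column_stack([np.ones(5), x])  # intercept + slope
y = 2.0 * x + 1.0                     # exactly linear toy data
```

On exactly linear data the held-out predictions are exact, so the LOO error is (numerically) zero; with noisy data it measures out-of-sample error directly, which is why it need not decompose like in-sample MSE.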