dllu 3 days ago

You can think of it as: linear regression models only the noise in y and not in x, whereas the ellipse / first eigenvector of the PCA models noise in both x and y.
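
A quick way to see the difference is to fit both on the same synthetic data. A minimal sketch, assuming numpy (all variable names are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0, 10, 200)
    x = t + rng.normal(0, 0.5, t.shape)               # noise in x
    y = 2.0 * t + 1.0 + rng.normal(0, 0.5, t.shape)   # noise in y

    # Ordinary least squares: minimizes vertical (y) residuals only,
    # so noise in x attenuates the fitted slope toward zero.
    ols_slope = np.polyfit(x, y, 1)[0]

    # PCA / total least squares: the first eigenvector of the covariance
    # matrix minimizes perpendicular distance, treating x and y symmetrically.
    eigvals, eigvecs = np.linalg.eigh(np.cov(x, y))
    v = eigvecs[:, np.argmax(eigvals)]                # principal direction
    pca_slope = v[1] / v[0]

    print(ols_slope, pca_slope)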

analog31 3 days ago | parent | next [-]

That brings up an interesting issue, which is that many systems do have more noise in y than in x. For instance, time series data from an analog-to-digital converter, where time is based on a crystal oscillator.

jjk166 3 days ago | parent | next [-]

Well yeah, x is specifically the thing you control, y is the thing you don't. For all but the most trivial systems, y will be influenced by something besides x, which will be a source of noise no matter how accurately you measure. Noise in x is purely due to setup error. If your x noise was greater than your y noise, you generally wouldn't bother taking the measurement in the first place.

bravura 3 days ago | parent [-]

> If your x noise was greater than your y noise, you generally wouldn't bother taking the measurement in the first place.

Why not? You could still do inference in this case.

jjk166 3 days ago | parent [-]

You could, and maybe sometimes you would, but generally you won't. If at all possible, it makes a lot more sense to reduce the x noise, either with a better setup or by changing your x to something you can better control.

GardenLetter27 3 days ago | parent | prev [-]

This fact underlies a lot of causal inference.

randrus 3 days ago | parent [-]

I’m not an SME here and would love to hear more about this.

CGMthrowaway 3 days ago | parent | prev | next [-]

So when fitting a trend, e.g. for data analytics, should we use the first eigenvector of the PCA instead of linear regression?

stdbrouw 3 days ago | parent [-]

(Generalized) linear models have a straightforward probabilistic interpretation -- E(Y|X) -- which I don't think is true of total least squares. So it's more of an engineering solution to the problem, and in statistics you'd be more likely to go for other methods such as regression calibration to deal with measurement error in the independent variables.
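
For the linear case, the simplest form of regression calibration amounts to the classical attenuation correction: divide the naive OLS slope by the reliability ratio. A hedged sketch, assuming numpy and that the x measurement-noise variance is known or estimated from replicates (the function name is mine, not a standard API):

    import numpy as np

    def calibrated_slope(x_obs, y, noise_var_x):
        # The naive OLS slope is attenuated by a factor
        # lambda = var(true x) / var(observed x);
        # dividing by an estimate of lambda undoes the bias.
        naive = np.polyfit(x_obs, y, 1)[0]
        var_obs = np.var(x_obs, ddof=1)
        reliability = (var_obs - noise_var_x) / var_obs
        return naive / reliability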

10000truths 3 days ago | parent | prev | next [-]

Is there any way to improve upon the fit if we know that e.g. y is n times as noisy as x? Or more generally, if we know the (approximate) noise distribution for each free variable?

dllu 2 days ago | parent | next [-]

Yeah, you can generally "whiten" the problem by scaling it along each axis until the variance is the same in every dimension. What you describe is the case where x and y have a covariance matrix like

    [ σ², 0;
      0,  (nσ)² ]
but whitening also works for any arbitrary covariance matrix.

[1] https://en.wikipedia.org/wiki/Whitening_transformation
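
A minimal sketch of that diagonal case, assuming numpy (for a full covariance matrix Σ you would instead transform the points by Σ^(-1/2); names are illustrative):

    import numpy as np

    def tls_slope_whitened(x, y, n):
        yw = y / n                          # y is n times as noisy: rescale to equalize variances
        eigvals, eigvecs = np.linalg.eigh(np.cov(x, yw))
        v = eigvecs[:, np.argmax(eigvals)]  # principal direction in whitened coordinates
        return n * v[1] / v[0]              # undo the scaling to recover the original slope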

defrost 3 days ago | parent | prev [-]

> Or more generally, if we know the (approximate) noise distribution for each free variable?

This was a thing 30-odd years ago in radiometric spectrometry surveying.

The X var was the time slot, a sequence of (say) one second observation accumulation windows; the Yn vars were 256 (or 512, etc.) sections of the observable ground gamma-ray spectrum (many low energy counts from the ground: Uranium, Thorium, Potassium, and associated breakdown daughter products; some high energy counts from the cosmic background that made it through the radiation belts and atmosphere to near-surface altitudes).

There was a primary NASVD (Noise Adjusted SVD) algorithm (a simple variance adjustment based on expected gamma event distributions by energy level) and a number of tweaks and variations based on how much other knowledge seemed relevant (broad-area geology, radon expression by time of day, etc.).

See, e.g.: Improved NASVD smoothing of airborne gamma-ray spectra, Minty / McFadden (1998) - https://connectsci.au/eg/article-abstract/29/4/516/80344/Imp...
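
A rough sketch of the core NASVD idea as described above, assuming Poisson counting noise (so variance scales with the mean count in each channel); the rank choice and names are illustrative, not taken from the paper:

    import numpy as np

    def nasvd_smooth(spectra, rank=8):
        # spectra: (n_windows, n_channels) array of gamma-ray counts
        scale = np.sqrt(np.maximum(spectra.mean(axis=0), 1e-9))  # Poisson: var ~ mean
        adjusted = spectra / scale            # noise variance now roughly uniform
        U, s, Vt = np.linalg.svd(adjusted, full_matrices=False)
        smoothed = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # keep leading components
        return smoothed * scale               # back to count space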

scotty79 2 days ago | parent | prev [-]

It might be cool to train a neural network by minimizing error under the assumption that there's noise on both the inputs and the outputs.