Remix.run Logo
srean a day ago

By using the (estimated) Radon Nikodym derivative between the the two measures -- the measure from which the labelers samples and the deployed to measure from which the on-deployment items are presumably sampled.

For this to work the two measures need to be absolutely continuous with each other.

This is close to your pre-penultimate paragraph and that's mathy enough. This done right can take care of bias but may do so at the expense of variance, so this Radon Nikodym derivative that is estimated needs to be done so under appropriate regularization in the function space.

Thinking of the solution in these terms requires mathematical thinking.

Now let's consider the case where some features may be missing on instances at the time of deployment but always present in training and the features are uncorrelated with each other (by construction).