datadrivenangel 2 days ago

How do you control for confounders and small data?

For data size: a medium-ish company may hire only a few engineers a year (1000-person company, 5% SWE staff, 20% annual turnover = 10 new engineers hired per year), so the sample will be small and any correlation you measure will be noisy and possibly weak.
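To put a rough number on "noisy", here is a minimal sketch that reproduces the hiring arithmetic above and then computes an approximate 95% confidence interval for a correlation at that sample size, using the Fisher z-transform. The observed correlation of 0.3 is an assumed value chosen purely for illustration, and the function names are made up for this example.

```python
import math

def hires_per_year(headcount, swe_share, turnover):
    # Engineers on staff times annual turnover = replacement hires per year.
    return headcount * swe_share * turnover

def correlation_ci(r, n, z_crit=1.96):
    # Approximate CI for a Pearson correlation via the Fisher z-transform.
    # Requires n > 3; z_crit=1.96 corresponds to a ~95% interval.
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Numbers from the comment above: 1000 people, 5% SWE, 20% turnover.
n = round(hires_per_year(1000, 0.05, 0.20))

# ASSUMPTION: an observed test-score/outcome correlation of 0.3,
# picked only to illustrate the interval width.
low, high = correlation_ci(0.3, n)
print(f"n={n}, 95% CI for r=0.3: ({low:.2f}, {high:.2f})")
```

With one year of hires the interval spans zero, so under these assumptions a single year of data could not distinguish a moderately useful test from a useless one; pooling several years narrows it.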

For confounders, a bad manager or atypical context may cause a great engineer to 'perform' poorly and leave early. Human factors are big.

jakubmazanec 2 days ago

Sure, psychological research is hard because of this, but that's not what I'm proposing - I'm talking about just having some data on the predictive validity of the hiring process. If there's a coding test: is it reliable and valid? Are some items redundant because they're too easy or too hard? Which items have the best discrimination parameter? How do the total scores correlate with, e.g., the length of the test takers' tenures?
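As a rough illustration of those item-level questions, here is a minimal classical-test-theory sketch. It computes each item's difficulty (proportion of candidates who got it right) and a corrected item-total correlation as a simple stand-in for discrimination; this is not the IRT discrimination parameter, which would require fitting a model such as 2PL. The response matrix is made-up data and the function names are hypothetical.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    # Plain Pearson correlation; returns NaN when either side has no variance.
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return float("nan")
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)

def item_analysis(responses):
    # responses: one row per candidate, one 0/1 entry per test item.
    totals = [sum(row) for row in responses]
    report = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        # Correlate the item with the score on the *other* items,
        # so the item does not inflate its own correlation.
        rest = [t - x for t, x in zip(totals, item)]
        report.append({
            "item": j,
            "difficulty": mean(item),
            "discrimination": pearson(item, rest),
        })
    return report

# HYPOTHETICAL data: 6 candidates, 3 items (1 = solved, 0 = not solved).
responses = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
]
report = item_analysis(responses)
for row in report:
    print(row)
```

An item everyone solves (item 0 here) has no variance, so its discrimination is undefined and it adds nothing to the ranking. The same `pearson` helper could be applied to total scores against tenure in months, given such data.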

Sure, the confidence intervals will be wide, but that doesn't matter: even noisy data are better than no data.

Maybe some companies already do this, but I haven't seen it (though my sample is small).