Remix.run Logo
hauntsaninja 4 days ago

git bisect works great for tracking down regressions, but relies on the bug presenting deterministically. But what if the bug is non-deterministic? Or worse, your behaviour was always non-deterministic, but something has changed, e.g. your tests went from somewhat flaky to very flaky.

In addition to the repo linked in the title, I also wrote up a little bit of the math behind it here: https://hauntsaninja.github.io/git_bayesect.html

ajb 10 hours ago | parent | next [-]

Nice! I implemented a similar thing a while back: https://github.com/Ealdwulf/BBChop

I'm going to have to check out how you got linear time with Shannon entropy, because I used Renyi entropy to do that, to make the algebra easier.

It's also possible to do it over the DAG, rather than a linear history - although that makes the code a lot more complicated. Unfortunately there doesn't seem to be a linear time cumulative sum algorithm over dags, so it's super linear in some cases.

belden 5 hours ago | parent | prev | next [-]

My team doesn’t always have cleanly bisectable branches being merged to main —- it’s not uncommon to see “fix syntax error” types of commits.

But, to merge we need to have all tests pass. (If tests flakily pass then we get new flakey tests, yay!)

I know git-bisect doesn’t support this: but could git-bayesect have an option to only consider merge commits? Being able to track a flake change back to an individual PR would be really useful.

sluongng 4 hours ago | parent [-]

You can run bisect with first-parent

hauntsaninja 3 hours ago | parent [-]

That sounds right. `git_bayesect` currently uses `--first-parent`, so I think belden's use case should work, but I haven't tested it much on complicated git histories.

Myrmornis 12 hours ago | parent | prev [-]

This is really cool! Is there an alternative way of thinking about it involving a hidden markov model, looking for a change in value of an unknown latent P(fail)? Or does your approach end up being similar to whatever the appropriate Bayesian approach to the HMM would be?

tazsat0512 an hour ago | parent [-]

The HMM framing connects to change-point detection. CUSUM (Cumulative Sum) charts solve a related problem: detecting when a process parameter has shifted by accumulating deviations from an expected value.

Key difference: CUSUM assumes sequential observation and asks "when did the distribution shift?" Bayesect asks "which commit should I test next?" — active learning vs passive monitoring.

But they could complement each other. If you already have CI pass/fail history, CUSUM on that data gives you a rough change-point estimate for free (no extra test runs), then bayesect refines it with active sampling.