| ▲ | rf15 4 days ago |
Honestly, after 20 years in the field: optimising the workflow for the case where you can already reliably reproduce the bug seems misapplied, because that's the part that already takes the least time and effort on most projects.
|
| ▲ | nixpulvis 4 days ago | parent | next [-] |
Just because you can reproduce it doesn't mean you know what is causing it. Running a bisect to find which commit introduced it will narrow the area you need to search for the cause.
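As a minimal sketch of that interactive workflow (v1.4.0 is a placeholder for whatever last-known-good revision applies):

    git bisect start
    git bisect bad                 # the current checkout exhibits the bug
    git bisect good v1.4.0         # placeholder: a revision known to work
    # git now checks out a commit roughly halfway; build, test, then report:
    git bisect good                # or: git bisect bad
    # ...repeat until git names the first bad commit, then clean up:
    git bisect reset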
| |
| ▲ | SoftTalker 4 days ago | parent [-] | | I can think of only a couple of cases over 20+ years where I had to bisect the commit history to find a bug. By far the normal case is that I can isolate it to a function or a query or a class pretty quickly. But most of my experience is with projects where I know the code quite well. | | |
| ▲ | cloud8421 4 days ago | parent | next [-] | | I think your last sentence is the key point - the times I've used bisect have been related to code I didn't really know, and where the knowledgeable person was no longer with the company or was on holiday. | | |
| ▲ | nixpulvis 4 days ago | parent | next [-] | | Exactly. And even if I do know the source pretty well, that doesn't mean I'm caught up on all the new changes coming in. It's often a lot faster to bisect than to read the log over the month or two since I touched something. | |
| ▲ | SoftTalker 4 days ago | parent | prev | next [-] | | Even so, normally anything like a crash or fatal error is going to give you a log message somewhere with a stack dump that will indicate generally where the error happened, if not the exact line of code. For more subtle bugs, where there's no hard error but something isn't doing the right thing, yes, bisect might be more helpful, especially if there is a known old version where the thing works and somewhere between that and the current version it broke. | |
| ▲ | hinkley 4 days ago | parent | prev [-] | | Or they were barking up a wrong tree and didn’t know it yet, and the rest of us were doing parallel discovery. Tick tock. You need competence in depth when you have SLIs. |
| |
| ▲ | wyldfire 4 days ago | parent | prev | next [-] | | > By far the normal case is that I can isolate it to a function or a query or a class pretty quickly In general, this takes human-interactive time. Maybe not much, but generally more interactive time than is required to write the bisect test script and invoke `git bisect run ...`. The fact that it's noninteractive means you can do other work in the meantime. Once it's done you may well have more information than if you had spent the same time interactively narrowing the scope of the bug. | |
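As a rough sketch of that noninteractive flow, assuming a hypothetical ./test.sh that exits 0 when the repro case passes, non-zero when it fails, and 125 for commits that can't be tested at all:

    #!/bin/sh
    # test.sh -- hypothetical repro script for `git bisect run`
    make -s || exit 125       # 125 = "can't test this commit, skip it"
    ./run_repro_case          # placeholder; must exit non-zero when the bug shows

Started with `git bisect start HEAD v1.4.0` (bad revision first, then a known-good one) followed by `git bisect run ./test.sh`, git walks the history unattended and prints the first bad commit; `git bisect reset` restores the original checkout.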
| ▲ | 4 days ago | parent | prev | next [-] | | [deleted] | |
| ▲ | hinkley 4 days ago | parent | prev [-] | | I’ve needed CPR zero times and bisect around a dozen. You should know both, particularly for emergencies. |
|
|
|
| ▲ | hinkley 4 days ago | parent | prev | next [-] |
I would add to nixpulvis’s comments that git history may also help you find a repro case, especially if you’ve only found a half-assed repro case that is overly broad. Before you find even that, your fire drill strategy is very, very important. Is there enough detail in the incident channel and our CD system for coworkers to put their dev sandbox in the same state as production? Is there enough of a clue about what is happening for them to run speculative tests in parallel? Is the data architecture clean enough that your experiments don’t change the outcome of mine? Onboarding docs and deployment process docs, if they are tight, reduce the Amdahl’s Law effect as it applies to figuring out what the bug is and where it is. Which is, in this context, also Brooks’s Law. |
|
| ▲ | zeroonetwothree 4 days ago | parent | prev [-] |
Eh, not always. If you work in a big codebase with 1000s of devs, then it can be quite tricky to find the cause of some bug when it's in some random library someone changed for a different reason.