a lot of software engineering, especially in complex systems, is still just tweaking retries, alarms, edge cases, etc. it might take 3 days to even figure out what went wrong