| ▲ | elevation 3 hours ago | |
> live patching should become part of the linux kernel

Services where uptime matters tend to be designed so they can tolerate the reboot of a single node for other reasons besides kernel maintenance. I can't imagine a situation where I can't tolerate the downtime of a reboot but I would be willing to risk the system locking up with brain surgery gone wrong. | ||
| ▲ | toast0 2 hours ago | |
> Services where uptime matters tend to be designed so they can tolerate the reboot of a single node for other reasons besides kernel maintenance. I can't imagine a situation where I can't tolerate the downtime of a reboot but I would be willing to risk the system locking up with brain surgery gone wrong.

I've run systems with live code updates for userland, and would have considered live kernel updates if they had been reasonable on our systems.

The thing is, you typically build your system to tolerate a reboot or unscheduled stop of a single node. A scheduled stop is nicer, but systems sometimes lock up even when you're not doing anything risky, so you know.

But just because the system can tolerate a reboot or restart doesn't mean it's not disruptive. A lock-up or similar failure during a hot load is also disruptive, of course. But when you can push code without having to stop anything, with limited impact on users, it's easier and faster to do updates. You can use whatever rollout pattern you like to contain risk, too, the same as you would for an upgrade with restarts.

For us, we have servers with hundreds of thousands or millions of TCP connections from mobile clients. Restarting a server would force all of those clients to reconnect, and connecting is expensive. Restarting all the servers would result in many clients reconnecting several times. It was better to avoid it when possible. | ||
| ▲ | PunchyHamster 2 hours ago | |
We used Ksplice for a good number of years (until Oracle bought it and stopped publishing patches for other Linux distros), and all in all it was a very stable technology 10 years ago.

> I can't imagine a situation where I can't tolerate the downtime of a reboot but I would be willing to risk the system locking up with brain surgery gone wrong.

Because you haven't worked at that level in an organization. Doing a restart in some cases might involve paperwork with your client and a maintenance window outside of working hours, even if the service is redundant. And some customers are fine with a little bit of downtime and don't want an active/active level of redundancy, but still insist on maintenance windows for any work like that. Live patching makes that a whole lot easier. | ||