Remix.run Logo
RantyDave 4 hours ago

"a thousand-day uptime shouldn’t be folklore"

I reboot a lot. Mostly I want to know that should the system need to reboot for whatever reason, that it will all come back up again. I run a very lightly loaded site and I highly doubt anybody notices the minute (or so) loss of service caused by rebooting.

Pretty sure I don't feel bad about this.

klempner 2 hours ago | parent | next [-]

There's a weird fetishization of long uptimes. I suspect some of this dates from the bad old days when Windows would outright crash after 50 days of uptime.

In the modern era, a lightly (or at least stably) loaded system lasting for hundreds or even thousands of days without crashing or needing a reboot should be a baseline unremarkable expectation -- but that implies that you don't need security updates, which means the system needs to not be exposed to the internet.

On the other hand, every time you do a software update you put the system in a weird spot that is potentially subtly different from where it would be on a fresh reboot, unless you restart all of userspace (at which point you might as well just reboot).

And of course FreeBSD hasn't implemented kernel live patching -- but then, that isn't a "long uptime" solution anyway, the point of live patching is to keep the system running safely until your next maintenance window.

cesarb 2 hours ago | parent | next [-]

> There's a weird fetishization of long uptimes. I suspect some of this dates from the bad old days when Windows would outright crash after 50 days of uptime.

My recollection is that, usually, it crashed more often than that. The 50 days thing was IIRC only the time for it to be guaranteed to crash (due to some counter overflowing).

> In the modern era, a lightly (or at least stably) loaded system lasting for hundreds or even thousands of days without crashing or needing a reboot should be a baseline unremarkable expectation -- but that implies that you don't need security updates, which means the system needs to not be exposed to the internet.

Or that the part of the system which needs the security updates not be exposed to the Internet. Other than the TCP/IP stack, most of the kernel is not directly accessible from outside the system.

> On the other hand, every time you do a software update you put the system in a weird spot that is potentially subtly different from where it would be on a fresh reboot, unless you restart all of userspace (at which point you might as well just reboot).

You don't need a software update for that. Normal use of the system is enough to make it gradually diverge from its "clean" after-boot state. For instance, if you empty /tmp on boot, any temporary file is already a subtle difference from how it would be on a fresh reboot.

Personally, I consider having to reboot due to a security fix, or even a stability fix, to be a failure. It means that, while the system didn't fail (crash or be compromised), it was vulnerable to failure (crashing or being compromised). We should aim to do better than that.

fragmede an hour ago | parent | prev [-]

It's from a bygone era. An era when you'd lose hours of work if you didn't go file -> save, (or ctrl-s, if you were obsessive). If you reboot, you lose all of your work, your configuration, that you haven't saved to disk. Computers were scarce, back in those days. There was one in the house, in the den, for the family. These days, I've got a dozen of them and everything autosaves. But so that's where that comes from.

arthurfirst 4 hours ago | parent | prev [-]

Regularly (every 3 years or so) had 1000+ days of uptime with FreeBSD rack servers on Supermicro mobos.

I built the servers myself and then shipped to colo half way around the world.

I got over 1400 once and then I needed to add a new disk. They ran for almost 13 years with some disk replacements, CPU upgrades, and memory additions

kiwijamo 3 hours ago | parent [-]

Genuine questions:

Do you ever apply kernel patches? I also run FreeBSD and reboot for any kernel patches and never can get my uptimes to 1,000 days before that.

Do you just run versions that don't get security patches? Security support EOL dates generally means I need to upgrade before 1,000 days too. For example the current stable release gets security patches only from June 10, 2025 to June 30, 2026 giving just over 360 days of active support.

I get FreeBSD is stable and get days of uptime, and I could easily do the same if I didn't bother upgrading etc, it's just that I can't see how that's done without putting your machine at risk. Perhaps only for airgapped machines?

toast0 an hour ago | parent [-]

> Do you ever apply kernel patches?

For my personal machines, I just run GENERIC kernels and that includes a lot, so I need to do a lot of updates. I also reboot every time I update the OS (even when it's an update that doesn't touch the kernel) so that I'm sure reboots will be fine... but I did setup my firewalls with carp and pfsync so I can reboot my firewall machines one at a time with minimal disruption.

For work machines, I use a crafted kernel config that only includes stuff we use, although so far I've usually had one config for all boxes, because it's simpler. If there's a security update for part of the kernel that we don't use, we don't need to update. Security update in a driver we don't have, no update; security update in tcp, probably update. Some security updates include mitigation steps to consider instead of or until updating... Sometimes those are reasonable too. Sometimes you do need to upgrade the whole fleet.

When there's an update that's useful but also has an effective mitigation, I would mitigate on all machines, run the update and reboot test on one or a few machines, and then typically install on the rest of the machines and if they reboot at some point, great, they don't need the mitigation anymore. If they are retired with 1000 days of uptime and a pending kernel update, that's fine too.

I would not update a machine just because support for the minor release it was on timed out. Only if there was an actual need. Including a security issue found in a later release that probably affects the older one or at least can't be ruled out. Yes, there's a risk of unpublished security issues in unsupported releases; but a supported release also has a risk of unpublished security issues too.