kijin 4 days ago

> 6. Disabling swap doesn't prevent pathological behaviour at near-OOM, although it's true that having swap may prolong it. Whether the global OOM killer is invoked with or without swap, or was invoked sooner or later, the result is the same: you are left with a system in an unpredictable state. Having no swap doesn't avoid this.

This is the most important reason I try to avoid having a large swap. The duration of pathological behavior at near-OOM is proportional to the amount of swap you have. The sooner your program is killed, the sooner your monitoring system can detect it ("Connection refused" is much more clear cut than random latency spikes) and reboot/reprovision the faulty server. We no longer live in a world where we need to keep a particular server online at all cost. When you have an army of servers, a dead server is preferable to a misbehaving server.

OP tries to argue that a long period of thrashing will give you an opportunity for more visibility and controlled intervention. This does not match my experience. It takes ages even to log in to a machine that is thrashing hard, let alone run any serious commands on it. The sooner you just let it crash, the sooner you can restore the system to a working state and inspect the logs in a more comfortable environment.

mickeyp 4 days ago | parent | next [-]

That assumes the OOM killer kills the right thing. It may well choose to kill something ancillary, which causes your OOM program to just hang or misbehave wildly.

The real danger in all of this, swap or no, is the shitty OOMKiller in Linux.

kijin 4 days ago | parent | next [-]

The OOM killer will be just as shitty whether you have swap or not. But the more swap you have, the longer your program will be allowed to misbehave. I prefer a quick and painless death.

xdfgh1112 4 days ago | parent | prev | next [-]

You can apply memory quotas to the individual processes with cgroups. You can also adjust how likely a process is to be killed.
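For illustration, a minimal Python sketch of both knobs, assuming cgroup v2 (where the hard limit lives in a group's `memory.max` file) and the per-process `/proc/<pid>/oom_score_adj` file, which accepts values from -1000 to 1000. The function names, the `proc_root` parameter, and the example group path are hypothetical, and writing to the real files normally needs root or a delegated cgroup.

```python
from pathlib import Path

# Valid range for /proc/<pid>/oom_score_adj.
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def set_memory_max(cgroup_dir, limit_bytes):
    """Write a hard memory limit (in bytes) to a cgroup v2 group's memory.max."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    Path(cgroup_dir, "memory.max").write_text(f"{limit_bytes}\n")


def set_oom_score_adj(pid, adj, proc_root="/proc"):
    """Bias the OOM killer: higher values make the process a likelier victim."""
    if not OOM_SCORE_ADJ_MIN <= adj <= OOM_SCORE_ADJ_MAX:
        raise ValueError("adj must be between -1000 and 1000")
    Path(proc_root, str(pid), "oom_score_adj").write_text(f"{adj}\n")
```

Something like `set_memory_max("/sys/fs/cgroup/myapp", 512 * 1024 * 1024)` (the group name is made up) would cap that group at 512 MiB, and `set_oom_score_adj(pid, 500)` would make a given process a preferred victim.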

man8alexd 4 days ago | parent | prev [-]

Nowadays, the OOM killer always chooses the largest process in the system/cgroup by default.

bawolff 4 days ago | parent | prev | next [-]

> OP tries to argue that a long period of thrashing will give you an opportunity for more visibility and controlled intervention.

I didn't get that impression. My read was that OP was arguing for user-space process killers so the system doesn't get to the point where the system becomes unresponsive due to thrashing.

kijin 3 days ago | parent [-]

From the article:

> With swap: ... We have more visibility into the instigators of memory pressure and can act on them more reasonably, and can perform a controlled intervention.

But of course if you're doing this kind of monitoring, you can probably just check your processes' memory usage and curb them long before they touch swap.

danw1979 4 days ago | parent | prev | next [-]

Amen to failing fast.

A machine that is responding just enough to keep a circuit breaker closed is the scourge of distributed systems.

heavyset_go 4 days ago | parent | prev [-]

Maybe I'm just insane, but if I'm on a machine with ample memory, and a process for some reason can't allocate resources, I want that process to fail ASAP. Same thing with high memory pressure situations, just kill greedy/hungry processes, please.

Like something is going very wrong if the system is in that state, so I want everything to die immediately.

gfv 4 days ago | parent | next [-]

sysctl vm.overcommit_memory=2. However, programs for *nix-based systems usually expect overcommit to be on, for example, to support fork(). This is in stark contrast with the Windows NT model, where an allocation will fail if it doesn't fit in the remaining memory+swap.

man8alexd 4 days ago | parent [-]

People disable memory overcommit, expecting to fix OOMs, and then they get surprised when their programs start failing mallocs while there are still tons of discardable page cache in the system.

https://unix.stackexchange.com/q/797835/1027 https://unix.stackexchange.com/q/797841/1027
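A rough sketch of why this happens: with vm.overcommit_memory=2 the kernel compares committed address space (`Committed_AS`) against `CommitLimit`, which the kernel documentation gives as (total RAM - huge pages) * overcommit_ratio / 100 + swap. Free memory and page cache do not enter into it. The Python below recomputes that limit from `/proc/meminfo`-style text. The sample numbers are made up, the function names are hypothetical, the default ratio of 50 is assumed, and the arithmetic is in kB rather than pages, so treat it as an approximation.

```python
# Made-up sample in /proc/meminfo format: 16 GB RAM, no swap, ~6 GB of page cache.
SAMPLE_MEMINFO = """\
MemTotal:       16000000 kB
MemFree:          500000 kB
Cached:          6000000 kB
SwapTotal:             0 kB
Committed_AS:    7900000 kB
HugePages_Total:       0
Hugepagesize:       2048 kB
"""


def parse_meminfo(text):
    """Return {field: integer value} for each 'Name: value [kB]' line."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def commit_limit_kb(meminfo, overcommit_ratio=50):
    """Approximate CommitLimit in kB for strict overcommit (mode 2)."""
    hugetlb_kb = meminfo.get("HugePages_Total", 0) * meminfo.get("Hugepagesize", 0)
    ram_kb = meminfo["MemTotal"] - hugetlb_kb
    return ram_kb * overcommit_ratio // 100 + meminfo.get("SwapTotal", 0)


def commit_headroom_kb(meminfo, overcommit_ratio=50):
    """kB that can still be committed before allocations start failing."""
    return commit_limit_kb(meminfo, overcommit_ratio) - meminfo["Committed_AS"]
```

With the sample above the limit works out to 8,000,000 kB and the headroom to 100,000 kB, so allocations start failing after roughly 100 MB more is committed even though about 6 GB of cache could be dropped.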

cmurf 3 days ago | parent | prev [-]

systemd-oomd does this.

The kernel oom killer is concerned with kernel survival, not user space performance.
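For a sense of what a user-space killer can look at: Linux exposes pressure stall information (PSI) in `/proc/pressure/memory`, with `some` and `full` lines carrying `avg10`, `avg60`, `avg300` and `total` fields. The Python below is only a sketch of parsing that format and applying a threshold. It is not how systemd-oomd is implemented, and the sample text, the function names, and the 20% threshold are all assumptions.

```python
# Made-up sample in the /proc/pressure/memory format.
SAMPLE_PSI = """\
some avg10=35.20 avg60=12.00 avg300=3.10 total=123456789
full avg10=28.50 avg60=9.40 avg300=2.00 total=98765432
"""


def parse_psi(text):
    """Return {'some': {...}, 'full': {...}} with each metric as a float."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        metrics = {}
        for field in fields[1:]:
            name, _, value = field.partition("=")
            metrics[name] = float(value)
        result[fields[0]] = metrics
    return result


def under_memory_pressure(psi, threshold_pct=20.0):
    """True if tasks were fully stalled on memory for at least threshold_pct
    of the last 10 seconds (the threshold is an arbitrary example value)."""
    return psi["full"]["avg10"] >= threshold_pct
```

A monitor would poll the real file, call `under_memory_pressure(parse_psi(text))`, and kill or throttle the offending cgroup well before the kernel OOM killer gets involved.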