tijsvd 10 hours ago
From what I understand of the follow-up: Postgres uses shared memory for its buffers, and a new connection reads that shared memory while holding a lock. Postgres handles each connection with a forked process, not a new thread. The first time such a forked process touches a page, even one that is already resident in memory, it takes a minor page fault, which traps into the kernel so it can update the process's page tables. The operation under the lock is only a few instructions, but if it takes longer than expected, that causes lock contention. Maybe a regression in how the kernel handles minor faults? The whole thing is then made worse because it's a spinlock: all the waiting processes spin and contend for the CPUs, which adds to the time the kernel needs to process the fault. It was mitigated by using huge pages, which dramatically reduces the number of mapping entries and faults. I reckon it could also be mitigated in Postgres by pre-faulting all shared memory early?
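To make the pre-faulting idea concrete, here is a rough C sketch of what I have in mind. It is not taken from Postgres: `map_shared`, `prefault_region` and `minor_faults` are hypothetical helper names, and the anonymous `MAP_SHARED` mapping only stands in for the real shared buffers. The assumption is that a freshly forked process would run `prefault_region` once at startup, before taking any lock, so the minor faults happen outside the critical section.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical stand-in for the shared buffer pool: an anonymous
 * shared mapping that forked children would inherit. */
static void *map_shared(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: read one byte from every page of the region so
 * the kernel sets up this process's page-table entries right away.
 * Returns the number of pages touched. */
static size_t prefault_region(const void *base, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    const volatile unsigned char *p = base; /* volatile: keep the reads */
    size_t touched = 0;
    for (size_t off = 0; off < len; off += (size_t)page) {
        (void)p[off];
        touched++;
    }
    return touched;
}

/* Hypothetical helper: minor page faults taken so far by this process. */
static long minor_faults(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}
```

Calling `minor_faults()` before and after `prefault_region()` in a forked child should show where the faults land, though I have not measured this against a real Postgres workload. The obvious cost is that every new connection pays for touching the whole region up front, which is presumably why huge pages are the more attractive fix.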