| ▲ | subhobroto 4 hours ago | |
You've given us some idea of the volume of your data but there's no mention of what's ingesting it or how. > during these stress tests the hardware is nowhere close to over-encumbered, and there's consistent headroom on both memory, CPU and disk I/O This assertion is likely wrong - you're likely skipping over some metrics that has clues to what we need to know. Here are some questions to get the discussion moving. - Is this PostgreSQL managed or self-hosted? Your mention of "consistent headroom on both memory, CPU and disk I/O" gives me hope you're self-hosting it but I've heard the same thing in the past from people attempting to use RDS and wondering the same as you are, so no assumptions. - Are you using COPY or multi-row INSERT statements? - How much RAM does that server have? - What is the fillfactor, max_wal_size and checkpoint_timeout? - Is the WAL on NVMe? - What's the iostat or wa during the slowdown? - Are random UUIDs (part of) the index? Have you posted to https://dba.stackexchange.com/ If I were you, I would create a GitHub repo that has scripts that synthesize the data and reproduce the issues you're seeing. | ||