Remix.run Logo
dangoodmanUT 3 days ago

In a previous use case, when using postgres as a WAL-like append only store, I noticed that indexes would get massive. Then, after a while, they'd magically shrink. I had eventually switched to an API on top of Badger (golang KV), which afforded me an order of magnitude lower latency at ~30% of the resources IIRC. I'm sure there might have been some tuning I could have done to improve it.

I've also heard similar behaviors exhibited from other folks who had similar high-write workloads on postgres.

Sorry, I don't have anything super tangible to provide off the top of my head, or metrics/code I can share to recreate! It was also a project that required a lot of data to recreate the setup for.

petergeoghegan 3 days ago | parent [-]

> In a previous use case, when using postgres as a WAL-like append only store, I noticed that indexes would get massive. Then, after a while, they'd magically shrink.

It's possible to recycle pages within indexes that have some churn (e.g., with workloads that use bulk range deletions). But it's not possible for indexes to shrink on their own, in a way that can be observed by monitoring the output of psql's "\di+" command. For that you'd need to REINDEX or run VACUUM FULL.

dangoodmanUT 3 days ago | parent | next [-]

I may not be remembering fully, maybe the indexes never shrunk but the tables did in size.

Is there no way to automatically clean up indexes then?

petergeoghegan 3 days ago | parent [-]

> I may not be remembering fully, maybe the indexes never shrunk but the tables did in size.

That only happens when it is possible to give back space to the OS filesystem using relation truncation in the first place -- which isn't all that common (it's generally only seen when there are bulk range deletions that leave lots of contiguous empty space at the end of a table/heap structure). But you said that this is an append-only workload.

This behavior can be disabled by setting the vacuum_truncate table storage parameter to "off". This is useful with workloads where relation truncation is disruptive (truncation needs to acquire a very heavyweight table lock).

> Is there no way to automatically clean up indexes then?

What I meant was that indexes do not support relation truncation. It follows that the amount of space used for an index (from the point of view of the OS) cannot ever go down, barring a REINDEX or a VACUUM FULL.

This does not mean that we cannot reuse space for previously freed/deleted pages (as long as we're reusing that space for the same index). Nor does it mean that "clean up" isn't possible in any general sense.

jashmatthews 3 days ago | parent | prev [-]

Does vacuum not release free pages at the end of an index file in the same way it does for the heap?

petergeoghegan 3 days ago | parent [-]

No, it does not