Postgres database and table storage and index optimizations
Optimizating database storage is always a key focus area when running databases. Lets get started with some basics.
Database storage used
Run the following query to see how much space your database currently using
And you may get further break down here:-
Table size spaces used
Let's see how much space used between data and index
And we may get the following outputs. We can see how much actual data stored in the data (data_size) and how much taken up by the index_size. It is important to keep index size smaller because this take up database memory as index are loaded into memory to speed up query.
The index size is ok, not great.
Take a look at this index vs data composition
You will notice that index size is 14G while data size is only 229M. This is quite alarming and a clear indication that your index is bloated and action must be taken to fix this
Here we can
1. Reindex table
REINDEX TABLE CONCURRENTLY KC_ISPN__offlineSessions
A standard REINDEX locks the table. Adding CONCURRENTLY builds a brand new, perfectly packed index in the background while your application continues to read and write. Once the new index is fully built, Postgres seamlessly swaps it with the old, bloated one and drops the old one.
Please note: This will generate WAL (database logs) because it is writing a new index, but it is the only safe way to remove bloat in production without extensions.
2. Reconfigure table for more optimized clean up. For high curning table, the vacuum threshold here is less significant.
3. Use VACUUM to clean table and Index. This is closely related to configuration above - instead of waiting for the clean up criteria to complete, we initate vacuum manually.
VACUUM is used to clean up data tuple (record that has been marked as deleted).
Please note: VACUUM generate WAL too. WAL file are chunked to 16M per file and as long as we give some room for postgres to catch up, these files are clean up automatically. In Azure portal you can see this via its metrics called "Transaction Log Used"
You can use the following sql to check size of WAL
SELECT pg_size_pretty(sum(size)) AS total_wal_size FROM pg_ls_waldir();
We should avoid using VACUUM FULL, instead we can use
Cost | How it is addresses |
|---|---|
No Locking | Standard VACUUM acquires only a Share Update Exclusive lock. It does not block SELECT, INSERT, UPDATE, or DELETE operations. |
Low I/O Impact | By lowering vacuum_cost_limit (default is usually 200) and increasing vacuum_cost_delay, you force the vacuum process to "sleep" frequently. This leaves I/O bandwidth available for your application queries. |
Modest WAL | Standard VACUUM does not rewrite data pages. It only updates the Visibility Map and Free Space Map. This generates negligible WAL compared to VACUUM FULL (which logs every row move). |
Comments