Some time ago I wrote blogpost which showed how to list tables that should be autovacuumed or autoanalyzed.
Query in there had one important problem – it didn't take into account per-table settings.
Specifically – it only used system-wide values for:
- autovacuum_analyze_scale_factor
- autovacuum_analyze_threshold
- autovacuum_vacuum_scale_factor
- autovacuum_vacuum_threshold
but these can be also set per table using syntax like:
=$ ALTER TABLE t2 SET ( autovacuum_vacuum_scale_factor = 0.3, autovacuum_vacuum_threshold = 200 );
Let's fix it.
To get per-table settings you have to look into reloptions column in pg_class table.
Appropriate query that gets real values of factors and thresholds looks like:
=$ SELECT
n.nspname,
c.relname,
c.oid AS relid,
c.reltuples,
s.n_dead_tup,
s.n_mod_since_analyze,
COALESCE(
(SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_scale_factor=' ),
current_setting('autovacuum_analyze_scale_factor')
)::float8 AS autovacuum_analyze_scale_factor,
COALESCE(
(SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_threshold=' ),
current_setting('autovacuum_analyze_threshold')
)::float8 AS autovacuum_analyze_threshold,
COALESCE(
(SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_scale_factor=' ),
current_setting('autovacuum_vacuum_scale_factor')
)::float8 AS autovacuum_vacuum_scale_factor,
COALESCE(
(SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_threshold=' ),
current_setting('autovacuum_vacuum_threshold')
)::float8 AS autovacuum_vacuum_threshold
FROM
pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid
WHERE c.relkind = 'r'
ORDER BY n.nspname, c.relname;
The magic to get appropriate values in in lines 8-23, but it's just 4 separate blocks, one for each variable. Let's look at one of them:
COALESCE(
(SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_scale_factor=' ),
current_setting('autovacuum_analyze_scale_factor')
)::float8 AS autovacuum_analyze_scale_factor
SELECT in line 2 unpacks reloptions (it's array of texts), and gets only the row that starts with “autovacuum_analyze_scale_factor=“. Then, value is split using = character, and it returns second part – which is the value.
If reloptions do not contain autovacuum_analyze_scale_factor, then the whole subselect in line 2 will return NULL, which will then get passed to COALESCE, which will substitute global value of autovacuum_analyze_scale_factor using current_setting call.
Finally in line 4 I added cast to float8, so that the value will be useful for math.
In my case, with these tables:
relname │ reloptions ─────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── t1 │ {autovacuum_enabled=off,autovacuum_analyze_scale_factor=0.2,autovacuum_analyze_threshold=100} t2 │ {autovacuum_enabled=off,autovacuum_vacuum_scale_factor=0.3,autovacuum_vacuum_threshold=200} t3 │ {autovacuum_enabled=off,autovacuum_analyze_scale_factor=0.2,autovacuum_analyze_threshold=100,autovacuum_vacuum_scale_factor=0.3,autovacuum_vacuum_threshold=200} t4 │ [NULL] (4 ROWS)
and these global settings:
name │ setting ─────────────────────────────────┼───────── autovacuum_analyze_scale_factor │ 0.1 autovacuum_analyze_threshold │ 50 autovacuum_vacuum_scale_factor │ 0.2 autovacuum_vacuum_threshold │ 50 (4 ROWS)
The query from above shows:
nspname │ relname │ relid │ reltuples │ n_dead_tup │ n_mod_since_analyze │ autovacuum_analyze_scale_factor │ autovacuum_analyze_threshold │ autovacuum_vacuum_scale_factor │ autovacuum_vacuum_threshold ─────────┼─────────┼───────┼───────────┼────────────┼─────────────────────┼─────────────────────────────────┼──────────────────────────────┼────────────────────────────────┼───────────────────────────── public │ t1 │ 51871 │ 0 │ 1052 │ 103453 │ 0.2 │ 100 │ 0.2 │ 50 public │ t2 │ 51881 │ 0 │ 10128 │ 112529 │ 0.1 │ 50 │ 0.3 │ 200 public │ t3 │ 51891 │ 0 │ 51360 │ 153761 │ 0.2 │ 100 │ 0.3 │ 200 public │ t4 │ 51901 │ 102401 │ 0 │ 0 │ 0.1 │ 50 │ 0.2 │ 50 (4 ROWS)
Now, with this in place, I can rewrite view from original post to:
=$ CREATE VIEW autovacuum_queue AS WITH s AS ( SELECT n.nspname, c.relname, c.oid AS relid, c.reltuples, s.n_dead_tup, s.n_mod_since_analyze, COALESCE( (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_scale_factor=' ), current_setting('autovacuum_analyze_scale_factor') )::float8 AS analyze_factor, COALESCE( (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_threshold=' ), current_setting('autovacuum_analyze_threshold') )::float8 AS analyze_threshold, COALESCE( (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_scale_factor=' ), current_setting('autovacuum_vacuum_scale_factor') )::float8 AS vacuum_factor, COALESCE( (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_threshold=' ), current_setting('autovacuum_vacuum_threshold') )::float8 AS vacuum_threshold FROM pg_class c JOIN pg_namespace n ON c.relnamespace = n.oid LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid WHERE c.relkind = 'r' ), tt AS ( SELECT nspname, relname, relid, n_dead_tup, n_mod_since_analyze, reltuples * vacuum_factor + vacuum_threshold AS v_threshold, reltuples * analyze_factor + analyze_threshold AS a_threshold FROM s ) SELECT nspname, relname, relid, CASE WHEN n_dead_tup > v_threshold THEN 'VACUUM' ELSE '' END AS do_vacuum, CASE WHEN n_mod_since_analyze > a_threshold THEN 'ANALYZE' ELSE '' END AS do_analyze FROM tt WHERE n_dead_tup > v_threshold OR n_mod_since_analyze > a_threshold;
and with this, I can:
=$ SELECT *, pg_relation_size(relid), pg_total_relation_size(relid) FROM autovacuum_queue ORDER BY pg_total_relation_size(relid) DESC ; nspname │ relname │ relid │ do_vacuum │ do_analyze │ pg_relation_size │ pg_total_relation_size ────────────┼──────────────┼───────┼───────────┼────────────┼──────────────────┼──────────────────────── public │ t3 │ 51891 │ VACUUM │ ANALYZE │ 7159808 │ 11812864 public │ t2 │ 51881 │ VACUUM │ ANALYZE │ 5128192 │ 7987200 public │ t1 │ 51871 │ VACUUM │ ANALYZE │ 4677632 │ 7020544 pg_catalog │ pg_statistic │ 2619 │ │ ANALYZE │ 196608 │ 344064 (4 ROWS)
Hope it helps.
pg_statistic gets in there (for “ANALYZE”) — should be excluded, I guess.