On 14th of October Tome Lane committed new patch from Itagaki Tahahiro:
Log Message: ----------- Support SQL-compliant triggers ON COLUMNS, ie fire ONLY IF certain COLUMNS are named IN the UPDATE's SET list. Note: the schema of pg_trigger has not actually changed; we've just started TO USE a COLUMN that was there ALL along. catversion bumped anyway so that this commit IS included IN the history OF potentially interesting changes TO system catalog contents. Itagaki Takahiro
Up to this moment, when you write a trigger on update (on INSERT and DELETE the COLUMN based triggers are obviously useless), it is ran whenever you issue update to any row (or no row at all!) in table.
But sometimes – you might want to run some trigger only when given field changes – for example: change tsvector column for TSearch2 when title or body changes, but not when validity status changes, or number of views – in this situation we'd want to skip running tsvector changes when title and body are the same because recalculation of tsvector is quite expensive. In other case it might be that trigger on update creates new update query to another table, and we want not to call it when it's not needed.
Let's see how it looks. Test table:
# \d test Table "public.test" +-----------+----------+---------------------------------------------------+ | Column | Type | Modifiers | +-----------+----------+---------------------------------------------------+ | id | integer | not null default nextval('test_id_seq'::regclass) | | views | integer | not null default 0 | | title | text | | | body | text | | | is_active | boolean | not null default false | | fts_data | tsvector | | +-----------+----------+---------------------------------------------------+ Indexes: "test_pkey" PRIMARY KEY, btree (id)
So, in 8.4, trigger to update ftp_data would look like this:
CREATE OR REPLACE FUNCTION trg_test_i() RETURNS TRIGGER AS $$ BEGIN NEW.fts_data := to_tsvector( COALESCE( NEW.title, '' ) || ' ' || COALESCE( NEW.body || '' ) ); RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER trg_test_i BEFORE INSERT ON test FOR EACH ROW EXECUTE PROCEDURE trg_test_i();
This is obligatory trigger on INSERT, which will be the same. And trigger on UPDATE would look like this:
CREATE OR REPLACE FUNCTION trg_test_u() RETURNS TRIGGER AS $$ BEGIN IF NEW.title IS DISTINCT FROM OLD.title OR NEW.body IS DISTINCT FROM OLD.BODY THEN NEW.fts_data := to_tsvector( COALESCE( NEW.title, '' ) || ' ' || COALESCE( NEW.body || '' ) ); END IF; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER trg_test_u BEFORE UPDATE ON test FOR EACH ROW EXECUTE PROCEDURE trg_test_u();
This of course work nicely, but has several issues:
- it's easy to forget to add IF
- it becomes more and more complex in case we have several such relations on single table
- while easy to read for seasoned dba, the IF will definitely raise questions from new DBAs
Now, luckily it is no longer needed. Instead of the trigger above I can just:
CREATE OR REPLACE FUNCTION trg_test_u() RETURNS TRIGGER AS $$ BEGIN NEW.fts_data := to_tsvector( COALESCE( NEW.title, '' ) || ' ' || COALESCE( NEW.body || '' ) ); RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER trg_test_u BEFORE UPDATE OF title, body ON test FOR EACH ROW EXECUTE PROCEDURE trg_test_u();
Which is the same code as in trg_test_i() function.
Also – at least theoretically – new approach should be faster. Let's see if it really is that way.
So, let's test:
CREATE TABLE test ( id serial PRIMARY KEY, irrelevant INT4, relevant INT4, updates INT4 DEFAULT 0 );
Now, let's insert some dummy data:
# INSERT INTO test (id, irrelevant, relevant, updates) SELECT i, i, i, 0 FROM generate_series(1,100000) i; INSERT 0 100000
And let's write test sql script:
perl -e 'printf("UPDATE test SET %s = %s + 1 WHERE id = %i;\n",$_%10?"irrelevant":"relevant",$_%10?"irrelevant":"relevant", $_) for (1..100000)' > testit.sql
File looks like this:
=$ head -n 21 testit.sql UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 1; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 2; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 3; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 4; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 5; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 6; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 7; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 8; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 9; UPDATE test SET relevant = relevant + 1 WHERE id = 10; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 11; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 12; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 13; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 14; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 15; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 16; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 17; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 18; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 19; UPDATE test SET relevant = relevant + 1 WHERE id = 20; UPDATE test SET irrelevant = irrelevant + 1 WHERE id = 21;
OK. So, first let's create “old-fashioned" trigger:
CREATE OR REPLACE FUNCTION test_trigger() RETURNS TRIGGER AS $BODY$ BEGIN IF NEW.relevant IS DISTINCT FROM OLD.relevant THEN NEW.updates := NEW.updates + 1; END IF; RETURN NEW; END; $BODY$ LANGUAGE 'plpgsql'; CREATE TRIGGER test_trigger BEFORE UPDATE ON test FOR EACH ROW EXECUTE PROCEDURE test_trigger();
And now let's test the testit.sql (100k updates, 10% of which are further processed with very simple plpgsql code). I ran the test 5 times, each time cleaning first the database. Times:
real 0m39.923s real 0m39.122s real 0m38.569s real 0m38.701s real 0m38.225s
And how it is with new, “ON COLUMN" trigger:
real 0m38.842s real 0m38.639s real 0m36.495s real 0m36.478s real 0m37.522s
Pretty similar new way is a bit faster ( 38.908 vs. 37.595 seconds on average ), so performance will not be important factor when choosing this new approach – but readability of code definitely will.
Cool stuff.
so maybe I’ve read this too fast and it was answered… (I think your code probably answers but…)
Support SQL-compliant triggers on columns, ie fire only if certain columns
are named in the UPDATE’s SET list.
so it won’t fire if I don’t name the column? or just generally do an insert on that column even if I name it?
@Caleb: according to PostgreSQL 8.5devel Docs, no. it won’t fire if you don’t name the column.
in particular, this means that a trigger defined as
CREATE TRIGGER trigger1
BEFORE UPDATE OF column1 ON table1
FOR EACH ROW EXECUTE PROCEDURE trigger1();
will *not* fire when column1 is updated by another trigger.
for me it’s a bit weird but probably it’s standards-compliant, if pg-gods decided this way.
Good fill someone in on and this post helped me alot in my college assignement. Thank you as your information.