One database that I am monitoring uses a lot of stored procedures. Some of them are fast, some of them are not so fast. I thought – is there a sensible way to diagnose which part of a stored procedure takes the most time?
I mean – I could just put the logic into the application, and then every query would have its own timing in Pg logs, but this is not practical. And I also believe that using stored procedures/functions is way better than using plain SQL, for a number of reasons.
So, I'm back to the question – how to check which part of a function takes most of the time?
Let's assume we have a very simplistic function:
CREATE OR REPLACE FUNCTION testit() RETURNS void AS $$
DECLARE
    x INT4;
    y INT4;
BEGIN
    x := 3;
    y := x * 2;
    RETURN;
END;
$$ LANGUAGE plpgsql;
This function doesn't do anything, but that's the point – I want to measure how long it takes to add profiling, and how much overhead the profiling itself adds.
Generally, the idea is to add RAISE "something" statements that will write information to the logs, without polluting the client application with useless output.
Before I get to this, let's establish a baseline – let's run this function, say, 10000 times, and check the time:
# SELECT COUNT(*) FROM ( SELECT testit() FROM generate_series(1,10000) ) AS x;
 count
-------
 10000
(1 row)

Time: 24.486 ms
Oops. A bit too fast to be sensibly comparable. Let's add some more iterations:
# SELECT COUNT(*) FROM ( SELECT testit() FROM generate_series(1,1000000) ) AS x;
  count
---------
 1000000
(1 row)

Time: 2801.626 ms
Better.
I also ran it two more times, and got these results:
- 2766.251 ms
- 3153.510 ms
OK. So now, let's add profiling. First I need to find a log level that goes to the logs, and not to the screen:
# show client_min_messages ;
 client_min_messages
---------------------
 notice
(1 row)

Time: 0.093 ms

# show log_min_messages ;
 log_min_messages
------------------
 warning
(1 row)
Now, thanks to the fine documentation, we can see that there is a 'LOG' level that gets logged but is not shown on screen.
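To spell out why that works – LOG sorts differently in the two severity orderings. A minimal sketch (the values shown are the stock defaults, nothing exotic):

-- LOG ranks *below* NOTICE in the client_min_messages ordering, but
-- *above* WARNING in the log_min_messages ordering – so with these
-- values, RAISE LOG reaches the log file while psql stays quiet.
SET client_min_messages = notice;   -- per session; hides LOG from the client
-- log_min_messages can only be changed by a superuser or in postgresql.conf:
--   log_min_messages = warning     # LOG still gets written to the server log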
So, knowing this, let's add some profiling. It should be as simple as possible, yet have all necessary information:
CREATE OR REPLACE FUNCTION testit() RETURNS void AS $$
DECLARE
    x INT4;
    y INT4;
BEGIN
    RAISE LOG 'PROFILING: testit() Step 1 : %', clock_timestamp();
    x := 3;
    RAISE LOG 'PROFILING: testit() Step 2 : %', clock_timestamp();
    y := x * 2;
    RAISE LOG 'PROFILING: testit() Step 3 : %', clock_timestamp();
    RETURN;
END;
$$ LANGUAGE plpgsql;
As you can see, I log 'PROFILING:' (to be able to quickly grep the logs), the name of the function (because I might be profiling more than one function at a time), and which step it was at. Plus a very precise timestamp.
After a single run, I have this in the logs:
2010-03-18 11:37:00.003 CET,"depesz","depesz",7949,"[local]",4ba1fdd7.1f0d,7721795,"SELECT",2010-03-18 11:17:59 CET,2/38,0,LOG,00000,"PROFILING: testit() Step 1 : 2010-03-18 11:37:00.003352+01",,,,,,"SELECT count(*) FROM (SELECT testit() from generate_series(1,1000000)) as x;",,,"psql"
2010-03-18 11:37:00.003 CET,"depesz","depesz",7949,"[local]",4ba1fdd7.1f0d,7721796,"SELECT",2010-03-18 11:17:59 CET,2/38,0,LOG,00000,"PROFILING: testit() Step 2 : 2010-03-18 11:37:00.003367+01",,,,,,"SELECT count(*) FROM (SELECT testit() from generate_series(1,1000000)) as x;",,,"psql"
2010-03-18 11:37:00.003 CET,"depesz","depesz",7949,"[local]",4ba1fdd7.1f0d,7721797,"SELECT",2010-03-18 11:17:59 CET,2/38,0,LOG,00000,"PROFILING: testit() Step 3 : 2010-03-18 11:37:00.003383+01",,,,,,"SELECT count(*) FROM (SELECT testit() from generate_series(1,1000000)) as x;",,,"psql"
Which can be simplified down to just the relevant information:
7949,"PROFILING: testit() Step 1 : 2010-03-18 11:37:00.003352+01" 7949,"PROFILING: testit() Step 2 : 2010-03-18 11:37:00.003367+01" 7949,"PROFILING: testit() Step 3 : 2010-03-18 11:37:00.003383+01"
The first column is important, as it is the PID of the backend process – we need it so that we don't mix profiling messages from various backends.
So. How much slower is it?
SELECT COUNT(*) FROM ( SELECT testit() FROM generate_series(1,1000000) ) AS x;
  count
---------
 1000000
(1 row)

Time: 50819.932 ms
That seems like a lot compared to the 3 seconds we had previously. But is it? Let's think: within those 51 seconds, 3 seconds were spent doing "normal stuff, not profiling related". That leaves 48 seconds of overhead. 48 seconds for 1 million calls, each running 3 RAISE LOG commands – this means that a single RAISE LOG took 0.016 ms!
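Just to double-check the arithmetic (a throwaway calculation, nothing more):

-- (profiled time - baseline) / total number of RAISE LOG calls
SELECT (50819.932 - 2801.626) / (1000000 * 3) AS ms_per_raise;
-- ms_per_raise ≈ 0.016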
Now. In our degenerate example it's a lot. But our function doesn't do anything. Let's try it on something *a bit* more realistic. We'll test 2 functions:
CREATE OR REPLACE FUNCTION insert_word( in_word TEXT ) RETURNS void AS $$
DECLARE
    use_word_id INT4;
BEGIN
    use_word_id := NEXTVAL('words_id_seq');
    INSERT INTO words (id, word) VALUES (use_word_id, in_word);
    LOOP
        UPDATE word_counts SET count = count + 1 WHERE word = in_word;
        IF FOUND THEN
            RETURN;
        END IF;
        BEGIN
            INSERT INTO word_counts (word, count) VALUES (in_word, 1);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing; loop and retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION insert_word2( in_word TEXT ) RETURNS void AS $$
DECLARE
    use_word_id INT4;
BEGIN
    INSERT INTO words (word) VALUES (in_word) RETURNING id INTO use_word_id;
    LOOP
        UPDATE word_counts SET count = count + 1 WHERE word = in_word;
        IF FOUND THEN
            RETURN;
        END IF;
        BEGIN
            INSERT INTO word_counts (word, count) VALUES (in_word, 1);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing; loop and retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
These functions relate to 2 tables that will need to be created as well:
CREATE TABLE words (
    id   serial PRIMARY KEY,
    word TEXT NOT NULL
);

CREATE TABLE word_counts (
    word  TEXT PRIMARY KEY,
    count INT4 NOT NULL DEFAULT 0
);
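A quick sanity check of what a single call does (starting from empty tables):

SELECT insert_word('hello');
SELECT insert_word('hello');

SELECT * FROM words;
--  id | word
-- ----+-------
--   1 | hello
--   2 | hello

SELECT * FROM word_counts;
--  word  | count
-- -------+-------
--  hello |     2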
Now. The functionality of these functions is not really relevant. Let's just see how well they behave without profiling, and then with profiling.
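The test data lives in a table x with a single text column y – you'll see it in the queries below. The original word list isn't shown anywhere, so this is a purely hypothetical reconstruction (the real distribution of duplicates obviously affects how often the UPDATE vs. INSERT branches fire):

-- hypothetical test data: 72900 rows with plenty of duplicate words
CREATE TABLE x (y text);
INSERT INTO x (y)
    SELECT 'word_' || (random() * 5000)::int4
    FROM generate_series(1, 72900);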
So, I created a test table with 72900 words along those lines, and ran this:
# SELECT COUNT(*) FROM ( SELECT insert_word( x.y) FROM x ) AS q;
Results?
- Time: 4638.691 ms
- Time: 4665.952 ms
- Time: 4667.546 ms
- Time: 4662.410 ms
- Time: 4660.251 ms
OK. Now the second function:
# SELECT COUNT(*) FROM ( SELECT insert_word2( x.y) FROM x ) AS q;
And its results:
- Time: 5023.776 ms
- Time: 4819.380 ms
- Time: 4656.849 ms
- Time: 4631.946 ms
- Time: 4613.677 ms
OK. Now, let's change the functions:
CREATE OR REPLACE FUNCTION insert_word( in_word TEXT ) RETURNS void AS $$
DECLARE
    use_word_id INT4;
BEGIN
    RAISE LOG 'PROFILING: insert_word() Step 1 : %', clock_timestamp();
    use_word_id := NEXTVAL('words_id_seq');
    RAISE LOG 'PROFILING: insert_word() Step 2 : %', clock_timestamp();
    INSERT INTO words (id, word) VALUES (use_word_id, in_word);
    RAISE LOG 'PROFILING: insert_word() Step 3 : %', clock_timestamp();
    LOOP
        UPDATE word_counts SET count = count + 1 WHERE word = in_word;
        IF FOUND THEN
            RAISE LOG 'PROFILING: insert_word() Step 4 : %', clock_timestamp();
            RETURN;
        END IF;
        RAISE LOG 'PROFILING: insert_word() Step 5 : %', clock_timestamp();
        BEGIN
            INSERT INTO word_counts (word, count) VALUES (in_word, 1);
            RAISE LOG 'PROFILING: insert_word() Step 6 : %', clock_timestamp();
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            RAISE LOG 'PROFILING: insert_word() Step 7 : %', clock_timestamp();
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION insert_word2( in_word TEXT ) RETURNS void AS $$
DECLARE
    use_word_id INT4;
BEGIN
    RAISE LOG 'PROFILING: insert_word2() Step 1 : %', clock_timestamp();
    INSERT INTO words (word) VALUES (in_word) RETURNING id INTO use_word_id;
    RAISE LOG 'PROFILING: insert_word2() Step 2 : %', clock_timestamp();
    LOOP
        UPDATE word_counts SET count = count + 1 WHERE word = in_word;
        IF FOUND THEN
            RAISE LOG 'PROFILING: insert_word2() Step 3 : %', clock_timestamp();
            RETURN;
        END IF;
        RAISE LOG 'PROFILING: insert_word2() Step 4 : %', clock_timestamp();
        BEGIN
            INSERT INTO word_counts (word, count) VALUES (in_word, 1);
            RAISE LOG 'PROFILING: insert_word2() Step 5 : %', clock_timestamp();
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            RAISE LOG 'PROFILING: insert_word2() Step 6 : %', clock_timestamp();
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
Now. I added 7 (insert_word) and 6 (insert_word2) profiling calls per function. Since each function is called 72900 times, it should make the runs roughly 8 (insert_word) or 7 (insert_word2) seconds slower.
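Back-of-the-envelope, generously assuming every profiling line fires on every call (in reality only 4-5 of them do, depending on the code path, so this is an upper bound):

SELECT 72900 * 7 * 0.016 AS insert_word_overhead_ms,   -- ~8165 ms
       72900 * 6 * 0.016 AS insert_word2_overhead_ms;  -- ~6998 ms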
How does it look in reality?
insert_word() * 72900:
- Time: 14767.158 ms
- Time: 12476.824 ms
- Time: 12517.576 ms
- Time: 12237.033 ms
- Time: 12181.778 ms
and insert_word2() * 72900:
- Time: 11001.922 ms
- Time: 11689.947 ms
- Time: 11148.601 ms
- Time: 15607.282 ms
- Time: 17824.019 ms
Quite good – practically spot on with the previous calculation. There were 2 problematic runs that took much longer – 15 and 17 seconds instead of 11 – but I think that was caused by some unrelated load on my machine.
If this seems like too much to you, think about it: let's assume you have a 150 ms function, and you add 20 profiling lines to it. The total cost of profiling is 0.3 ms! So of course – it is not usable for fast functions that run in less than 0.1 ms, but these do not need profiling anyway.
Generally, I think that 10 profiling steps should be enough for every case. And let's assume that profiling is allowed to use up to 3% of function time. This means that we can sensibly profile functions as fast as 5.3 ms per call! Which means that for the vast majority of uses, this kind of profiling does not put any measurable load on the server.
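The arithmetic behind that 5.3 ms, for the record:

-- 10 steps × 0.016 ms each = 0.16 ms of overhead per call;
-- if that may be at most 3% of the runtime, the function must take at least:
SELECT (10 * 0.016) / 0.03 AS min_sensible_function_ms;  -- ≈ 5.33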
Now. We did some profiled calls to function. What can we do with it?
First of all – assuming you're logging to csvlog (which I believe is the best option) – we'll just load it into PostgreSQL.
I created a simple program that connected to the database and ran all of these 72900 inserts in parallel (10 workers), in random order, choosing randomly which function to use.
This program generated ~750 MB of logs. But hey – it ran over 700,000 queries, each logging at the very least 3 messages (the exact number of messages depends on which code path was chosen).
So. Let's load the logs into Pg. According to the docs we just need a simple table, and then we can "COPY" the data into it. Let's try.
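For completeness, a sketch of that step. The column list is lifted from the csvlog section of the documentation for the version I'm running (it changes between versions, so check the docs for yours), and the file path is obviously made up:

CREATE TABLE postgres_log (
    log_time timestamptz(3),
    user_name text,
    database_name text,
    process_id int4,
    connection_from text,
    session_id text,
    session_line_num int8,
    command_tag text,
    session_start_time timestamptz,
    virtual_transaction_id text,
    transaction_id int8,
    error_severity text,
    sql_state_code text,
    message text,
    detail text,
    hint text,
    internal_query text,
    internal_query_pos int4,
    context text,
    query text,
    query_pos int4,
    location text,
    application_name text,
    PRIMARY KEY (session_id, session_line_num)
);

COPY postgres_log FROM '/path/to/pg.csv' WITH csv;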
one CREATE TABLE, one COPY, and 10 minutes later …
# SELECT COUNT(*) FROM postgres_log ;
  count
---------
 3354488
(1 row)
Nice.
Now. Let's get some sensible data out of it.
The final query will be a bit complicated, so let's build it step by step. Also – since I have 3.3 million lines in the log, when writing the intermediate queries I usually add a "limit 100" to get the results faster, but I'm showing the queries without the limit so they are easier to follow.
First we need to get only the lines that relate to our profiling:
SELECT l.*
FROM postgres_log l
WHERE l.error_severity = 'LOG'
    AND l.message LIKE 'PROFILING: %'
Great. Now let's extract just the relevant information from it, and split it into multiple columns so that it will be easier to group later:
WITH simplified_splitted AS (
    SELECT
        l.process_id,
        regexp_split_to_array( l.message, E'\\s+' ) AS message
    FROM postgres_log l
    WHERE l.error_severity = 'LOG'
        AND l.message LIKE 'PROFILING: %'
)
SELECT
    process_id,
    message[2] AS function_name,
    message[4] AS step,
    CAST( message[6] || ' ' || message[7] AS timestamptz ) AS logged_when
FROM simplified_splitted
A word of explanation – I used WITH to do the regexp_split_to_array only once, and then, in the main query, work on the already-split version of the message.
Results of this query look like this:
 process_id | function_name  | step |          logged_when
------------+----------------+------+-------------------------------
      13219 | insert_word()  | 1    | 2010-03-18 12:50:20.985081+01
      13219 | insert_word()  | 2    | 2010-03-18 12:50:20.985369+01
      13219 | insert_word()  | 3    | 2010-03-18 12:50:20.985939+01
      13219 | insert_word()  | 5    | 2010-03-18 12:50:20.986539+01
      13219 | insert_word()  | 6    | 2010-03-18 12:50:20.98677+01
      13219 | insert_word2() | 1    | 2010-03-18 12:50:20.987871+01
      13219 | insert_word2() | 2    | 2010-03-18 12:50:20.988092+01
      13226 | insert_word()  | 1    | 2010-03-18 12:50:20.988143+01
      13219 | insert_word2() | 4    | 2010-03-18 12:50:20.988335+01
      13226 | insert_word()  | 2    | 2010-03-18 12:50:20.988383+01
Brilliant. Now, we can add some window functions to calculate times.
Generally, we need to group the records (well, not grouping in the SQL "GROUP BY" sense – just "treat them as related"), order each group by logged_when, and then calculate the difference between the previous row and the current one. Seems simple?
Let's do it 🙂 First, try to run it on first 20 rows:
WITH simplified_splitted AS (
    SELECT
        l.process_id,
        regexp_split_to_array( l.message, E'\\s+' ) AS message
    FROM postgres_log l
    WHERE l.error_severity = 'LOG'
        AND l.message LIKE 'PROFILING: %'
),
full_splitted AS (
    SELECT
        process_id,
        message[2] AS function_name,
        message[4] AS step,
        CAST( message[6] || ' ' || message[7] AS timestamptz ) AS logged_when
    FROM simplified_splitted
)
SELECT
    f.*,
    lag( step ) OVER ( profiling ) AS previous_step,
    lag( logged_when ) OVER ( profiling ) AS previous_logged_when
FROM full_splitted f
WINDOW profiling AS (
    PARTITION BY f.process_id, f.function_name
    ORDER BY logged_when ASC
)
ORDER BY f.process_id, f.function_name, f.logged_when
Results look like this:
 process_id | function_name  | step |          logged_when          | previous_step |     previous_logged_when
------------+----------------+------+-------------------------------+---------------+-------------------------------
      13219 | insert_word()  | 1    | 2010-03-18 12:50:20.985081+01 | [null]        | [null]
      13219 | insert_word()  | 2    | 2010-03-18 12:50:20.985369+01 | 1             | 2010-03-18 12:50:20.985081+01
      13219 | insert_word()  | 3    | 2010-03-18 12:50:20.985939+01 | 2             | 2010-03-18 12:50:20.985369+01
      13219 | insert_word()  | 5    | 2010-03-18 12:50:20.986539+01 | 3             | 2010-03-18 12:50:20.985939+01
      13219 | insert_word()  | 6    | 2010-03-18 12:50:20.98677+01  | 5             | 2010-03-18 12:50:20.986539+01
      13219 | insert_word2() | 1    | 2010-03-18 12:50:20.987871+01 | [null]        | [null]
      13219 | insert_word2() | 2    | 2010-03-18 12:50:20.988092+01 | 1             | 2010-03-18 12:50:20.987871+01
      13219 | insert_word2() | 4    | 2010-03-18 12:50:20.988335+01 | 2             | 2010-03-18 12:50:20.988092+01
      13219 | insert_word2() | 5    | 2010-03-18 12:50:20.988464+01 | 4             | 2010-03-18 12:50:20.988335+01
      13219 | insert_word2() | 1    | 2010-03-18 12:50:20.988905+01 | 5             | 2010-03-18 12:50:20.988464+01
      13219 | insert_word2() | 2    | 2010-03-18 12:50:20.988981+01 | 1             | 2010-03-18 12:50:20.988905+01
      13219 | insert_word2() | 4    | 2010-03-18 12:50:20.989045+01 | 2             | 2010-03-18 12:50:20.988981+01
      13219 | insert_word2() | 5    | 2010-03-18 12:50:20.989108+01 | 4             | 2010-03-18 12:50:20.989045+01
      13219 | insert_word2() | 1    | 2010-03-18 12:50:20.989436+01 | 5             | 2010-03-18 12:50:20.989108+01
      13219 | insert_word2() | 2    | 2010-03-18 12:50:20.98951+01  | 1             | 2010-03-18 12:50:20.989436+01
      13226 | insert_word()  | 1    | 2010-03-18 12:50:20.988143+01 | [null]        | [null]
      13226 | insert_word()  | 2    | 2010-03-18 12:50:20.988383+01 | 1             | 2010-03-18 12:50:20.988143+01
      13226 | insert_word()  | 3    | 2010-03-18 12:50:20.988747+01 | 2             | 2010-03-18 12:50:20.988383+01
      13226 | insert_word()  | 5    | 2010-03-18 12:50:20.989298+01 | 3             | 2010-03-18 12:50:20.988747+01
      13226 | insert_word()  | 6    | 2010-03-18 12:50:20.989448+01 | 5             | 2010-03-18 12:50:20.989298+01
Nice. A couple of notes – if the current "step" is 1, there is no point in calculating the time difference, because the previous step in this backend, from the same function, comes from a different call to that function.
So, we will need to filter out rows with "step = 1" (but this has to be done in the final query, not in the ones we now have in "WITH").
And besides that – let's also calculate the time differences, and which step transition each row represents:
WITH simplified_splitted AS (
    SELECT
        l.process_id,
        regexp_split_to_array( l.message, E'\\s+' ) AS message
    FROM postgres_log l
    WHERE l.error_severity = 'LOG'
        AND l.message LIKE 'PROFILING: %'
),
full_splitted AS (
    SELECT
        process_id,
        message[2] AS function_name,
        message[4] AS step,
        CAST( message[6] || ' ' || message[7] AS timestamptz ) AS logged_when
    FROM simplified_splitted
    LIMIT 20
),
lagged_info AS (
    SELECT
        f.*,
        lag( step ) OVER ( profiling ) AS previous_step,
        lag( logged_when ) OVER ( profiling ) AS previous_logged_when
    FROM full_splitted f
    WINDOW profiling AS (
        PARTITION BY f.process_id, f.function_name
        ORDER BY logged_when ASC
    )
)
SELECT
    i.function_name,
    i.previous_step || ' -> ' || i.step,
    i.logged_when - i.previous_logged_when
FROM lagged_info i
WHERE i.step <> '1'
Results from some random 20 profiling rows look like this:
 function_name  | ?column? |    ?column?
----------------+----------+-----------------
 insert_word()  | [null]   | [null]
 insert_word()  | 2 -> 3   | 00:00:00.000071
 insert_word()  | 3 -> 4   | 00:00:00.000087
 insert_word()  | 1 -> 2   | 00:00:00.000042
 insert_word()  | 2 -> 3   | 00:00:00.000061
 insert_word()  | 3 -> 4   | 00:00:00.000067
 insert_word2() | 1 -> 2   | 00:00:00.000073
 insert_word2() | 2 -> 3   | 00:00:00.000108
 insert_word2() | 1 -> 2   | 00:00:00.000075
 insert_word2() | 2 -> 3   | 00:00:00.000204
 insert_word()  | 1 -> 2   | 00:00:00.000045
 insert_word()  | 2 -> 3   | 00:00:00.000056
 insert_word()  | 3 -> 4   | 00:00:00.000086
 insert_word2() | 1 -> 2   | 00:00:00.000079
 insert_word2() | 2 -> 3   | 00:00:00.000104
(15 rows)
So now, we can add full grouping, and present full statistics:
WITH simplified_splitted AS (
    SELECT
        l.process_id,
        regexp_split_to_array( l.message, E'\\s+' ) AS message
    FROM postgres_log l
    WHERE l.error_severity = 'LOG'
        AND l.message LIKE 'PROFILING: %'
),
full_splitted AS (
    SELECT
        process_id,
        message[2] AS function_name,
        message[4] AS step,
        CAST( message[6] || ' ' || message[7] AS timestamptz ) AS logged_when
    FROM simplified_splitted
),
lagged_info AS (
    SELECT
        f.*,
        lag( step ) OVER ( profiling ) AS previous_step,
        lag( logged_when ) OVER ( profiling ) AS previous_logged_when
    FROM full_splitted f
    WINDOW profiling AS (
        PARTITION BY f.process_id, f.function_name
        ORDER BY logged_when ASC
    )
)
SELECT
    i.function_name,
    i.previous_step || ' -> ' || i.step AS which_step,
    SUM( i.logged_when - i.previous_logged_when ) AS total_time,
    COUNT(*) AS total_count
FROM lagged_info i
WHERE i.step <> '1'
GROUP BY function_name, which_step
ORDER BY total_time DESC
Results (this time from all 3.3 million rows) look like this:
 function_name  | which_step |   total_time    | total_count
----------------+------------+-----------------+-------------
 insert_word2() | 2 -> 3     | 00:00:34.921153 |      327246
 insert_word()  | 3 -> 4     | 00:00:34.18836  |      328851
 insert_word2() | 1 -> 2     | 00:00:30.443333 |      363517
 insert_word()  | 2 -> 3     | 00:00:24.551921 |      365483
 insert_word()  | 1 -> 2     | 00:00:18.136481 |      365483
 insert_word2() | 4 -> 5     | 00:00:02.722998 |       36270
 insert_word2() | 2 -> 4     | 00:00:02.699597 |       36271
 insert_word()  | 5 -> 6     | 00:00:02.697111 |       36630
 insert_word()  | 3 -> 5     | 00:00:02.601562 |       36632
 insert_word()  | 5 -> 7     | 00:00:00.00527  |           2
 insert_word2() | 4 -> 6     | 00:00:00.000399 |           1
 insert_word()  | 7 -> 4     | 00:00:00.000154 |           2
 insert_word2() | 6 -> 3     | 00:00:00.000103 |           1
(13 rows)
Looks pretty sweet to me. Of course the results here are obvious, and there is not really anything I can do to make these particular functions faster, but it shows that the method really works.
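A natural extension – divide total_time by total_count to get a per-step average (PostgreSQL happily divides an interval by an integer). Just swap the final SELECT of the query above for something like:

SELECT
    i.function_name,
    i.previous_step || ' -> ' || i.step AS which_step,
    SUM( i.logged_when - i.previous_logged_when ) AS total_time,
    COUNT(*) AS total_count,
    SUM( i.logged_when - i.previous_logged_when ) / COUNT(*) AS avg_time
FROM lagged_info i
WHERE i.step <> '1'
GROUP BY function_name, which_step
ORDER BY avg_time DESC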
Thank you for this (and all your articles)! It allowed me to check and optimize a slow function (and revealed some really stupid thinking on my part) from 12 seconds to less than one!
The edb-debugger module on pgfoundry also contains an excellent plpgsql profiling tool, which allows you to gather similar data (plus some). It’s less invasive (no code changes required) but you need to explicitly turn it on/off in each session to be traced.
I’m using it a bit to discover potential improvements in my plpgsql. By day I’m an Oracle developer, but I’ve found the postgres offering to be easier to use and more useful than Oracle’s dbms_profiler.
ever tried track_functions in postgresql 8.4?
@ASH:
sure. not sure though how that's relevant. track_functions tracks how many times a function was called. This blog post is about checking how much time parts of a function take.
Depesz WTF, this blog layout is awful. Bring back the old one.
@TOM:
no way. 2 people complained about new layout (you and metys), and all other opinions i got were: “YAY, it’s so much better than previous”.
@depesz make that 3. At least people could choose the layout before.
@Matthew:
heh, perhaps I should add a poll – “Do you like new layout” 🙂
But since you made it to 3 – the theme chooser is back 🙂
Yay! “AS 400” selected here.
That windowing stuff is a mind bender. How about this instead?
mark := clock_timestamp();
-- stuff
RAISE LOG 'Step 1 : %', clock_timestamp() - mark;
@Brian:
Sure, it's possible, but then you'd have to add another mark := clock_timestamp(); *after* each RAISE LOG.
Plus – you have to do more work at runtime (more calls to clock_timestamp(), plus the subtraction), which I wanted to avoid to keep the overhead as small as possible.
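To make the trade-off concrete, a hypothetical sketch of Brian's variant applied to the testit() example – note the re-marking after each RAISE LOG:

CREATE OR REPLACE FUNCTION testit() RETURNS void AS $$
DECLARE
    x    INT4;
    y    INT4;
    mark timestamptz := clock_timestamp();
BEGIN
    x := 3;
    RAISE LOG 'PROFILING: testit() Step 1 : %', clock_timestamp() - mark;
    mark := clock_timestamp();  -- the extra re-marking mentioned above
    y := x * 2;
    RAISE LOG 'PROFILING: testit() Step 2 : %', clock_timestamp() - mark;
    RETURN;
END;
$$ LANGUAGE plpgsql;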