Waiting for 9.0 – pg_upgrade

On May, 12ve, Bruce Momjian committed new contrib module for 9.0 – pg_upgrage.

As I understand – this is what was available before as pg-migrator.

If you're not familiar with it – it's a tool that allows upgrade of $PGDATA from some version to some version. What's the use case? Let's assume you have this 200GB database working as 8.3, and you'd like to go to 8.4 (or 9.0). Normal way is pg_dump + pg_restore – which will take some time. With pg-migrate/pg_upgrade it should be faster, and easier. So, let's play with it.

Continue reading Waiting for 9.0 – pg_upgrade

Stupid tricks – hiding value of column in select *

One of the most common questions is “how do I get select * from table, but without one of the column".

Short answer is of course – name your columns, instead of using *. Or use a view.

But I decided to take a look at the problem.

Continue reading Stupid tricks – hiding value of column in select *

Getting unique elements

Let's assume you have some simple database with “articles" – each article can be in many “categories". And now you want to get list of all articles in given set of categories.

Standard approach:

SELECT
    a.*
FROM
    articles AS a
    JOIN articles_in_categories AS aic ON a.id = aic.article_id
WHERE
    aic.category_id IN (14,62,70,53,138)

Will return duplicated article data if given article is in more than one from listed categories. How to remove redundant rows?

Continue reading Getting unique elements

Tips n’ Tricks – using “wrong” index

More than once I've seen situation when there is a table, with serial primary key, and rows contain also some kind of creation timestamp, which is usually monotonic, or close to monotonic.

Example of such case are for example comments or posts in forums – each get it's ID, but they also have creation timestamp. And it usually is so that higher ids were added later than the lower ids.

So, let's assume you have such table, and somebody asks you to make a report on data from last month. How?

Continue reading Tips n' Tricks – using “wrong" index

How to remove backups?

Question from title sounds weird to you? It's just a ‘rm backup_filename'? Well. I really wish it was so simple in some cases.

One of the servers I'm looking into, there is interesting situation:

  • quite busy database server (2k tps is the low point of the day)
  • very beefy hardware
  • daily backups, each sized at about 100GB
  • backups stored on ext3 filesystem with default options
  • before launching daily backup, script removes oldest backup (we keep 3 days of backups on this machine)

Continue reading How to remove backups?

Profiling stored procedures/functions

One database that I am monitoring uses a lot of stored procedures. Some of them are fast, some of them are not so fast. I thought – is there a sensible way to diagnose which part of stored procedure take the most time?

I mean – I could just put the logic into application, and then every query would have it's own timing in Pg logs, but this is not practical. And I also believe that using stored procedures/functions is way better than using plain SQL due to a number of reasons.

So, I'm back to question – how to check which part of function takes most of the time?

Continue reading Profiling stored procedures/functions

Setting WAL Replication

There are several approaches on replication/failover – you might have heard of Slony, Londiste, pgPool and some other tools.

WAL Replication is different from all of them in one aspect – it doesn't let you query slave database (until 9.0, in which you actually can run read only queries on slave.

Since you can't run queries on slave, what is it good for? Well. It's good, and great in 1 very important aspect – all things that happen in database are replicated. Schema changes. Sequence modifications. Everything.

There is also drawback – you can't (as of now) replicate just one database. You replicate whole cluster (I don't like this word in this context – let's say: whole installation) of PostgreSQL. All databases that reside in given DATA directory.

So, the question is – how to set it up?

Continue reading Setting WAL Replication

Stupid tricks – Dynamic updates of fields in NEW in PL/pgSQL

Dynamic updates of fields in NEW in PL/pgSQL

Today, on #postgresql on IRC, strk asked about updating fields in NEW record, in plpgsql, but where name of the field is in variable.

After some time, he sent his question to hackers mailing list. And he got prompt reply that it's not possible.

Well, I dare to disagree.

Continue reading Stupid tricks – Dynamic updates of fields in NEW in PL/pgSQL