Is there any good versioning package for database schema and/or data?

Like practically any application, applications that I deal with evolve. They change their requirements when it comes to database storage – new tables, new columns, modified columns. Or perhaps – new base data – like new values in virtually static dictionaries.

For past years we've been working with set of pl/plgsql functions which kept track of “patches", and their dependencies.

The problem with this approach is that it doesn't really scale well, and it generates problems when we use replication (Slony).

The problems mean that we have to apply the patches “by hand" on master/slave server, because downtime in the day is not acceptable, and nobody is willing to do upgrades at 3 am just to be able to add new column.

So, we deal with it. But lately I grew annoyed by this, and started to think about a better way to organize patches.

I “dream" about a system when I would write a patch itself, and the system will make it possible to install and uninstall it with automatic changing database to prior state (this is not simple with things like “update", but it is definitely possible).

System which would “understand" replication, and apply changes to all replicated servers in a way that it will be as safe as possible with no or minimal downtime.

System which would let me track dependencies, and then install them if I'll tell it to install patch, that requires some other patches that were not applied to target database.

Basically – I want something like apt-get/dpkg/rpm for database.

So, writing this seems to be perfectly possible, but is it necessary? Perhaps somebody someplace already wrote such system? Do you know any? Or should I stop whining, sit down and write it myself?

Text comparisons that does automatic trim()

SoftNum asked on irc:

< SoftNum> does postgresql have a config option to automatically trim (both ' ' from blah) on string compares?

So, can you?

Of course there is no such option, but maybe there is a way to tell PostgreSQL to do this trim for given field? Sure there is 🙂

Continue reading Text comparisons that does automatic trim()

Tips N’ Tricks – count of all and just some

Let's assume you have very simple table with users:

# \d users
                           Table "public.users"
  Column   |  Type   |                     Modifiers
-----------+---------+----------------------------------------------------
 id        | integer | not null default nextval('users_id_seq'::regclass)
 username  | text    | not null
 is_active | boolean | not null default true
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "users_username_key" UNIQUE, btree (username)

And you'd like to count all users, and know how many of them are active …

Continue reading Tips N' Tricks – count of all and just some

Waiting for 8.4 – Common Table Expressions (WITH queries)

On 4th of September Tom Lane committed another great patch. This one is very large, and even after applying – it's has some rough edges. There will be need for additional patches to make the functionality fully robust, but the fact that it got committed means that it will be available in final 8.4.

What does it do?

Continue reading Waiting for 8.4 – Common Table Expressions (WITH queries)

Waiting for 8.4 – new FSM (Free Space Map)

On 30th of September, Heikki Linnakangas committed his patch that changes FSM:

Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the
free space information is stored in a dedicated FSM relation fork, with each
relation (except for hash indexes; they don't use FSM).
 
This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any
trace of them from the backend, initdb, and documentation.
 
Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also
introduce a new variant of the get_raw_page(regclass, int4, int4) function in
contrib/pageinspect that let's you to return pages from any relation fork, and
a new fsm_page_contents() function to inspect the new FSM pages.

Continue reading Waiting for 8.4 – new FSM (Free Space Map)

Waiting for 8.4 – database-level lc_collation and lc_ctype

On 23rd of September, Heikki Linnakangas committed patch that was written by Radek Strnad (actually committed patch is a stripped-down version of original).

What it does? It adds ability to have (finally!) different collation order and character categories for different databases.

Continue reading Waiting for 8.4 – database-level lc_collation and lc_ctype

Getting record by 2 criteria, ordered by third – how to do it quickly?

Let's assume we have a simple table:

CREATE TABLE objects (
    id          serial PRIMARY KEY,
    category    INT4        NOT NULL DEFAULT 0,
    object_type INT4        NOT NULL DEFAULT 0,
    entered_on  TIMESTAMPTZ NOT NULL DEFAULT now()
);

(This is simplification, but it contains all necessary columns).

What should I do to be able to quickly get 50 newest objects in given category/object_type (or in many categories/many object_types, or in all categories/object_types), optionally with limiting entered_on “older than …".

Continue reading Getting record by 2 criteria, ordered by third – how to do it quickly?

Waiting for 8.4 – ordered data loading in pg_dump

Great (and admittedly long overdue) patch by Tom Lane:

Make pg_dump --data-only try to order the table dumps so that foreign keys'
referenced tables are dumped before the referencing tables.  This avoids
failures when the data is loaded with the FK constraints already active.
If no such ordering is possible because of circular or self-referential
constraints, print a NOTICE to warn the user about it.

What it exactly means?

Continue reading Waiting for 8.4 – ordered data loading in pg_dump