Tips N’ Tricks – Generating readable reports with plain SQL

Let's say you imported some data, but it contains duplicates. You will have to handle them in some way, but to make sensible choice on how to handle it, you need more information.

So, let's start. We have table:

# \d users
                                    Table "public.users"
   Column   |           Type           |                     Modifiers
------------+--------------------------+----------------------------------------------------
 id         | integer                  | not null default nextval('users_id_seq'::regclass)
 username   | text                     |
 registered | timestamp with time zone |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)

Continue reading Tips N’ Tricks – Generating readable reports with plain SQL

explain.depesz.com – update

I just updated explain.depesz.com with 2 new interface features:

  • When you put mouse cursor over node, it will mark direct child nodes with
  • When you click on node, all child nodes (even indirect) will be hidden, and the node you clicked on will be marked with

Both features are directed towards people who analyze longer plans – ability to hide parts or just visually see what the Nested Loop is calling in 2nd stage greatly helps me see the flow of query – you can test it, for example, here.

By the way – BIG THANK YOU for Metys for help on it (well, actually, for doing it).

explain.depesz.com – update

I just modified the internals of explain.depesz.com. Now, it finally stores the plans in database (previously it stored the plans as files in dedicated directory).

Effect for enduser is just that history page should load faster.

But, having the data in database makes it possible to add more features.

One such feature is already added – ability to explain plan, but not list it on history page.

Right now, when you add new plan, you can specify if you want it to be listed with other previous explains – default value is “yes". But if you don't want just anybody to be able to click his way to your plan – there is option for it.

For obvious reasons all previous plans are marked as public now. If you want it to be changed, please contact me – we can probably do something about it.

And as last thing: I would like to express my big THANK YOU to all of you who use explain.depesz.com.

(side note: Catalyst is really cool)

explain.depesz.com – update

I just updated explain.depesz.com with the newest explain-parsing library version (Pg::Explain v 0.09).

This version will hit CPAN mirrors in next couple of hours.

Changes:

  • Fix exclusive time calculations
  • Make PE::Node understand Bitmap scans (heap and index)
  • Add proper handling of nodes that were “never executed"
  • Add ->is_analyzed method to PE::Node to make it easy to distinguish between EXPLAIN and EXPLAIN ANALYZE nodes

Getting list of all children in “adjacency list” tree structure

So, you have a table which looks like this:

# \d test
                           Table "public.test"
  Column   |  Type   |                     Modifiers
-----------+---------+---------------------------------------------------
 id        | integer | not null default nextval('test_id_seq'::regclass)
 parent_id | integer |
 x         | text    |
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "test_parent_id_fkey" FOREIGN KEY (parent_id) REFERENCES test(id)
Referenced by:
  "test_parent_id_fkey" IN test FOREIGN KEY (parent_id) REFERENCES test(id)

And you would like to easily get all children starting from given node?

Continue reading Getting list of all children in “adjacency list" tree structure

Waiting for 8.4 – no more -d in pg_dump!

Usually I write about new features in 8.4, but this time I'd like to write about feature that will be actually missing in 8.4. And thank God, it will be missing.

On Mon, 09 Mar 2009 11:22:47 -0400 Greg Sabino Mullane wrote mail to pgsql-hackers list with his patch that removes -d switch from pg_dump.

Later there was some discussion (20 mails) that extended the patch to remove also -D.

And now, today, Tom Lane committed:

Remove the -d and -D options of pg_dump and pg_dumpall.  The functionality
is still available, but you must now write the long equivalent --inserts
or --column-inserts.  This change is made to eliminate confusion with the
use of -d to specify a database name in most other Postgres client programs.
Original patch by Greg Mullane, modified per subsequent discussion.

This is great news. One less way a new user of pg (or one that doesn't read –help pages) can do himself harm, one less thing that is purely illogical.

find.best.tablespace.split.pl

Robert Treat reported an error with find.best.tablespace.split.pl. In some cases it could die with ‘division by zero' error.

Problem was solved using patch from Robert, which simply treats “0" in number of rows fetched or written, as ‘0.001'.

Additionally, I added a simpler way to specify multiple schemas to use tables from.

Code is available in svn repository.

lpad() and rpad() gotcha

I was lately writing some program for a client of mine, which used UPC codes matching.

Since the codes are given in various ways, there was decision to pad the codes with leading zeros – up to 12 characters.

The code has been done, and worked like this:

# SELECT lpad('123456789', 12, '0');
     lpad
--------------
 000123456789
(1 ROW)

Continue reading lpad() and rpad() gotcha