luckymurali_81 on freenodes #postgresql had a problem. his query returns data in wrong order. what can we do about it?
his query:
SELECT yday.region,yday.yday,mtd.mtd,target.target FROM (SELECT dr.region,SUM(f.collection) AS yday FROM fact_collection f,dim_time dt,dim_region dr WHERE dt.bus_date=current_date-45 AND dt.time_id=f.time_id AND f.region_id=dr.region_id GROUP BY dr.region) AS yday, (SELECT dr.region,SUM(f.collection) AS mtd FROM fact_collection f,dim_time dt, dim_region dr WHERE dt.bus_date BETWEEN date_trunc('month',current_date-45) AND CURRENT_DATE AND dt.time_id=f.time_id AND f.region_id=dr.region_id GROUP BY dr.region) AS mtd, (SELECT dr.region,SUM(f.target)AS target FROM dim_region dr,fact_collection_target f ,dim_time dt WHERE dt.month_id=f.month_id AND dr.region_id=f.region_id AND dt.month_id=5 GROUP BY dr.region) AS target WHERE yday.region=mtd.region AND mtd.region=target.region UNION SELECT 'Total' AS region,SUM(foo.yday),SUM(foo.mtd),SUM(foo.target) FROM ( SELECT yday.region,yday.yday,mtd.mtd,target.target FROM (SELECT dr.region,SUM(f.collection) AS yday FROM fact_collection f,dim_time dt,dim_region dr WHERE dt.bus_date=current_date-45 AND dt.time_id=f.time_id AND f.region_id=dr.region_id GROUP BY dr.region) AS yday, (SELECT dr.region,SUM(f.collection) AS mtd FROM fact_collection f,dim_time dt, dim_region dr WHERE dt.bus_date BETWEEN date_trunc('month',current_date-45) AND CURRENT_DATE AND dt.time_id=f.time_id AND f.region_id=dr.region_id GROUP BY dr.region) AS mtd, (SELECT dr.region,SUM(f.target)AS target FROM dim_region dr,fact_collection_target f ,dim_time dt WHERE dt.month_id=f.month_id AND dr.region_id=f.region_id AND dt.month_id=5 GROUP BY dr.region) AS target WHERE yday.region=mtd.region AND mtd.region=target.region )AS foo;
outputted results as:
"Central";144034;2594109;100394040 "East";144344;2591332;101193660 "North";144910;2591508;100558860 "South";143733;2588980;100303770 "Total";720490;12960302;502879290 "West";143469;2594373;100428960
while he wanted “Total" to be at the end.
so, how to you do it?
there are at least 3 possible choices:
- change “union" to “union all" – union has the feature that is removes duplicate rows. to do so, it has to sort source resultsets – thus killing order. union all doesn't sort as it doesn't remove duplicates – so order will be “as expected"
- you can notice that third column is some kind of summary, so adding “order by 3 asc" would effectively put “total" to the end of results because it has the highest value there:)
- third way involves a small trick. we can sort the results the way we want using functional order by.
first two options are easy. what about a functional order?
to simplify queries i created a table with data as shown by luckymurali_81:
CREATE TABLE lucky ( region text, yday int4, mtd int4, target int4 ); copy lucky FROM stdin; West 143469 2594373 100428960 Central 144034 2594109 100394040 Total 720490 12960302 502879290 South 143733 2588980 100303770 East 144344 2591332 101193660 North 144910 2591508 100558860 \.
now, let's try some “magic":
# SELECT * FROM lucky ORDER BY (CASE WHEN region = 'Total' THEN 1 ELSE 0 END); region | yday | mtd | target ---------+--------+----------+----------- West | 143469 | 2594373 | 100428960 Central | 144034 | 2594109 | 100394040 South | 143733 | 2588980 | 100303770 East | 144344 | 2591332 | 101193660 North | 144910 | 2591508 | 100558860 Total | 720490 | 12960302 | 502879290 (6 ROWS)
nice. but can we do it that regions will be sorted by name, but still keep total “down there"?
sure, we will use a nice addition to postgresql “order by .. nulls first/last":
# SELECT * FROM lucky ORDER BY (CASE WHEN region = 'Total' THEN NULL ELSE region END) ASC NULLS LAST; region | yday | mtd | target ---------+--------+----------+----------- Central | 144034 | 2594109 | 100394040 East | 144344 | 2591332 | 101193660 North | 144910 | 2591508 | 100558860 South | 143733 | 2588980 | 100303770 West | 143469 | 2594373 | 100428960 Total | 720490 | 12960302 | 502879290 (6 ROWS)
of course, this kind of functional order can be used in much more complicated scenarios. but this shows it's usability in quite simple way 🙂
luckymurali_81 suggested another method:
13:29 depesz, thanks
13:30 despesz one more method
13:31 select * from (youroriginalquery) as foo order by (region=’Total’),region
ouch – one bad news – order by “nulls first/last” is 8.3 only.
i’m sorry for inconvenience – i’m using 8.3 on my lap, and didn’t check previous versions.
so, on 8.2 the best way seems to be luckymurali_81 way with order by (condition), field;
thanks depesz, i’m still waiting for 8.3 🙂
@xor:
yeah, i know. sorry – i am so used to using 8.3 that i use 8.3-isms out of habit.
nİCE BLOG. cONGRATS.
thanks.
I actually do the sorting by adding an index column, let’s say:
select ‘W’,10 union select ‘A’,20 union select ‘T’,10+20
I change to:
select col1,col2 from (select ‘W’ as col1,10 as col2,1 as idx union select ‘A’,20,1 union select ‘T’,10+20,2) as t order by idx, col1