For a long time I was looking for SQL pretty printer.
Some queries that I had to deal with, over the years, were just insane to read, like this:
SELECT DECODE(a.category,NULL,b.category,a.category) "category",b.par "Total object Request",b.ps "objects Served", b.ar "Total sushi Request", a.sushis "sushis Served", ROUND((DECODE(b.ar,0,0,(b.ar - CAST(DECODE(a.sushis,NULL,0,a.sushis) AS NUMERIC))/b.ar))*100,3) "USR", a.clk "points", DECODE(b.ps,0,0,ROUND((a.clk/b.ps)*100,3)) "CTR",a.cpc "CPC", a.tc "Cost", DECODE(b.ps,0,0,CAST((a.tc/b.ps)*1000 AS NUMERIC(8,3))) "effectcost" FROM (SELECT DECODE(b.category,NULL,'N/A',b.category) category, SUM(doughnuts) sushis, SUM(points) clk, ROUND(CAST(SUM(total_cost) AS NUMERIC),3) tc, DECODE(SUM(points),0,0,ROUND(CAST((SUM(total_cost)/SUM(points)) AS NUMERIC),3)) cpc FROM daily_city_dealer_summary a, category_dealer_map b WHERE a.category_dealer_id=b.category_dealer_id AND created_day BETWEEN '2010-05-01' AND '2010-05-25' GROUP BY b.category) a FULL OUTER JOIN (SELECT DECODE(a.category,NULL,'N/A', DECODE(a.category,'-','World-Remaining countries',a.category)) category, SUM(valid_object_request) par, SUM(valid_sushi_request) ar, SUM(object_doughnuts) ps FROM traffic_hit a WHERE request_date BETWEEN '2010-05-01' AND '2010-05-25' GROUP BY a.category) b ON LOWER(a.category)=LOWER(b.category) ORDER BY 4 DESC;
For years the best (in my opinion) way to format queries was pg_format program from pgFormatter project.
It is really impressive, but has one important drawback – reformatting is done using applying some rather complex regexps, and custom made tokenizers.
I dreamed of something that would actually parse the query, and based on parse tree rebuild the query using some consistent logic.
There is libpg_query library, and based on it, Ruby GEM.
This code is using Pg parser extracted from sources. And parses arbitrary queries to relatively easy to work with tree.
Based on this, I wrote parser microservice, and then pretty printer in perl.
Today I changed explain.depesz.com and paste.depesz.com to use this new code for pretty printing.
If some data will not parse, or my code doesn't support it (yet), both sites will fall back to using pgFormatter.
Additionally, I added a way to pretty print using paste.depesz.com without even saving data on paste.depesz.com site, using /prettify url.
You can use it like this:
=$ echo 'select a from b' | curl -s -XPOST --data-urlencode "q@-" https://paste.depesz.com/prettify SELECT a FROM b; -- Formatted by Pg::SQL::PrettyPrinter
You can write yourself simple script, like /usr/local/bin/prettify.sh with content:
#!/usr/bin/env bash curl -s -XPOST --data-urlencode "q@${1:--}" https://paste.depesz.com/prettify
And then you can use it for files too:
=$ prettify.sh z.sql SELECT 123; -- Formatted by Pg::SQL::PrettyPrinter
All prettified SQLs (via /prettify, or on explain.depesz.com, or on
paste.depesz.com, will contain, in last line, comment that says which prettifying library was used.
So, if your query isn't prettified by Pg::SQL::PrettyPrinter, and you think it should. Or it is, but in incorrect way, please reach to me by filling an issue.
For now only DML queries are handled (select, insert, update, delete), but I will be adding DDL in future.
Hope you'll find it useful.
Also, I'd like to thank Alexandre Felipe (bob) and Andreas ‘ads' Scherbaum for prodding me to get it done. Probably wouldn't have done it without you 🙂
Finally, as a quick example, the query from beginning of post, after formatting with Pg-SQL-PrettyPrinter looks like this:
SELECT decode( a.category, NULL, b.category, a.category ) AS category, b.par AS "Total object Request", b.ps AS "objects Served", b.ar AS "Total sushi Request", a.sushis AS "sushis Served", round( decode( b.ar, 0, 0, ( b.ar - decode( a.sushis, NULL, 0, a.sushis )::pg_catalog.numeric ) / b.ar ) * 100, 3 ) AS "USR", a.clk AS points, decode( b.ps, 0, 0, round( ( a.clk / b.ps ) * 100, 3 ) ) AS "CTR", a.cpc AS "CPC", a.tc AS "Cost", decode( b.ps, 0, 0, ( a.tc / b.ps ) * 1000::pg_catalog.numeric( 8, 3 ) ) AS effectcost FROM ( SELECT decode( b.category, NULL, 'N/A', b.category ) AS category, SUM( doughnuts ) AS sushis, SUM( points ) AS clk, round( SUM( total_cost )::pg_catalog.numeric, 3 ) AS tc, decode( SUM( points ), 0, 0, round( SUM( total_cost ) / SUM( points )::pg_catalog.numeric, 3 ) ) AS cpc FROM daily_city_dealer_summary AS a, category_dealer_map AS b WHERE a.category_dealer_id = b.category_dealer_id AND created_day BETWEEN '2010-05-01' AND '2010-05-25' GROUP BY b.category ) AS a FULL JOIN ( SELECT decode( a.category, NULL, 'N/A', decode( a.category, '-', 'World-Remaining countries', a.category ) ) AS category, SUM( valid_object_request ) AS par, SUM( valid_sushi_request ) AS ar, SUM( object_doughnuts ) AS ps FROM traffic_hit AS a WHERE request_date BETWEEN '2010-05-01' AND '2010-05-25' GROUP BY a.category ) AS b ON LOWER( a.category ) = LOWER( b.category ) ORDER BY 4 DESC; -- Formatted by Pg::SQL::PrettyPrinter
Perhaps not ideal, but it's a change.
I guess formatting is always a matter of personal preference
The case statement I like much better with the new formatter, more on one line that belongs together. However, I personally prefer decode on one line (or what belongs logically together) and not to be scattered among this many lines.
but now I have to search through the pg documentation, I thought decode in this context does not exist in Postgres, only Oracle.
@stefan:
Well, while it is personal preference, the fact that it rebuilds the query, as opposed to reformatting, means that it will be WAY easier to configure it.
Also, I’m not sure what you mean about decode – pg doesn’t have decode function as shown in the example, it must have been added by someone that shared original query.
Hello,
about the formmating of first query, did you used pgFormatter ? What the needed chnages to have the reserved keys in green and the text in red ?
Thanks in advance.
@Hugh24:
Sorry, I don’t quite understand.
pgFormatter adds some whitespace to make the query more readable.
Colors are added by some plugin to wordpress.