On 8th of March, Alvaro Herrera committed patch:
XMLTABLE is defined by the SQL/XML standard as a feature that allows turning XML-formatted data into relational form, so that it can be used as a <table primary> in the FROM clause of a query. This new construct provides significant simplicity and performance benefit for XML data processing; what in a client-side custom implementation was reported to take 20 minutes can be executed in 400ms using XMLTABLE. (The same functionality was said to take 10 seconds using nested PostgreSQL XPath function calls, and 5 seconds using XMLReader under PL/Python). The implemented syntax deviates slightly from what the standard requires. First, the standard indicates that the PASSING clause is optional and that multiple XML input documents may be given to it; we make it mandatory and accept a single document only. Second, we don't currently support a default namespace to be specified. This implementation relies on a new executor node based on a hardcoded method table. (Because the grammar is fixed, there is no extensibility in the current approach; further constructs can be implemented on top of this such as JSON_TABLE, but they require changes to core code.) Author: Pavel Stehule, Álvaro Herrera Extensively reviewed by: Craig Ringer Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com
Long story short, it allows for extracting data in table format (rows and columns) directly from XML.
How? Let's see.
First – sample XML:
<people> <person> <first_name>Hubert</first_name> <last_name>Lubaczewski</last_name> <nick>depesz</nick> </person> <person> <first_name>Andrew</first_name> <last_name>Gierth</last_name> <nick>RhodiumToad</nick> </person> <person> <first_name>Devrim</first_name> <last_name>Gündüz</last_name> </person> </people>
For brevity, I will shorten it in queries.
So, let's try to get some data from it:
$ WITH x AS ( SELECT '<people>...</people>'::xml AS source_xml ) SELECT decoded.* FROM x, xmltable( '//people/person' passing source_xml COLUMNS first_name text, last_name text, nick text ) AS decoded first_name | last_name | nick ------------+-------------+------------- Hubert | Lubaczewski | depesz Andrew | Gierth | RhodiumToad Devrim | Gündüz | [NULL] (3 ROWS)
Nice. We can also do some column renaming, like:
$ WITH x AS ( SELECT '<people>...</people>'::xml AS source_xml ) SELECT decoded.* FROM x, xmltable( '//people/person' passing source_xml COLUMNS first_name text, last_name text, nick_name text PATH 'nick' ) AS decoded; first_name | last_name | nick_name ------------+-------------+------------- Hubert | Lubaczewski | depesz Andrew | Gierth | RhodiumToad Devrim | Gündüz | (3 ROWS)
or substitute default values:
$ WITH x AS ( SELECT '<people>...</people>'::xml AS source_xml ) SELECT decoded.* FROM x, xmltable( '//people/person' passing source_xml COLUMNS first_name text, last_name text, nick text DEFAULT '---' ) AS decoded; first_name | last_name | nick ------------+-------------+------------- Hubert | Lubaczewski | depesz Andrew | Gierth | RhodiumToad Devrim | Gündüz | --- (3 ROWS)
you can also add ordinality info, or even use xpath functions:
$ WITH x AS ( SELECT '<people>...</people>'::xml AS source_xml ) SELECT decoded.* FROM x, xmltable( '//people/person' passing source_xml COLUMNS id FOR ordinality, full_name text PATH 'concat(first_name, " ", last_name)', nick text DEFAULT '---' ) AS decoded; id | full_name | nick ----+--------------------+------------- 1 | Hubert Lubaczewski | depesz 2 | Andrew Gierth | RhodiumToad 3 | Devrim Gündüz | --- (3 ROWS)
Full functionality requires some understanding/knowledge of xml and xpath, but it's definitely cool thing to have. Will significantly simplify code required to load data from xml sources.
Thanks, guys.