On 27th of February, Heikki Linnakangas committed patch:
Add support for piping COPY to/from an external program. This includes backend "COPY TO/FROM PROGRAM '...'" syntax, and corresponding psql \copy syntax. Like with reading/writing files, the backend version is superuser-only, and in the psql version, the program is run in the client. In the passing, the psql \copy STDIN/STDOUT syntax is subtly changed: if you the stdin/stdout is quoted, it's now interpreted as a filename. For example, "\copy foo from 'stdin'" now reads from a file called 'stdin', not from standard input. Before this, there was no way to specify a filename called stdin, stdout, pstdin or pstdout. This creates a new function in pgport, wait_result_to_str(), which can be used to convert the exit status of a process, as returned by wait(3), to a human-readable string. Etsuro Fujita, reviewed by Amit Kapila.
The commit message seems to be self explanatory, but let's see how it works, and what it can be used for.
First example, copying data to a compressed file:
$ copy pg_class TO program 'gzip -9c - > /tmp/copy.data.gz'; COPY 296
And, it works:
=$ ls -l /tmp/copy.data.gz -rw------- 1 pgdba pgdba 4293 Feb 28 14:15 /tmp/copy.data.gz =$ zcat /tmp/copy.data.gz | head pg_statistic 11 10817 0 10 0 11797 0 15 386 15 2840 0 t f p r 26 0 f f f f f 712 1 {pgdba=arwdDxt/pgdba} \N pg_type 11 71 0 10 0 0 0 8 332 8 0 0 t f p r 30 0 t f f f f 712 1 {=r/pgdba} \N sys_variables 2200 16409 0 16385 0 16407 0 1 2 0 16410 0 t f p r 4 0 f f f f f 765 1 \N \N pg_toast_2619 99 11049 0 10 0 11799 0 2 10 2 0 2841 t f p t 3 0 f f f f f 712 1 \N \N pg_toast_2619_index 99 0 0 10 403 11801 0 2 10 0 0 0 f f p i 2 0 f f f f f 0 0 \N \N pg_authid_rolname_index 11 0 0 10 403 0 1664 2 1 0 0 0 f t p i 1 0 f f f f f 0 0 \N \N pg_authid_oid_index 11 0 0 10 403 0 1664 2 1 0 0 0 f t p i 1 0 f f f f f 0 0 \N \N pg_attribute_relid_attnam_index 11 0 0 10 403 0 0 16 2265 0 0 0 f f p i 2 0 f f f f f 0 0 \N \N pg_attribute_relid_attnum_index 11 0 0 10 403 0 0 11 2265 0 0 0 f f p i 2 0 f f f f f 0 0 \N \N pg_toast_1255 99 11046 0 10 0 0 0 0 0 0 0 2837 t f p t 3 0 f f f f f 712 1 \N \N
Nice. Of course, the way I ran it – i.e. COPY sql command, it is being executed on database server, using the shell account that Postgres daemon is running from. Thanks to \copy in psql, we can use it for local program access, like this:
$ CREATE TABLE z (i int4); CREATE TABLE $ \copy z FROM program 'perl -le "print int(1000 * rand()) for 1..10"' $ SELECT * FROM z; i ----- 21 817 526 892 349 332 21 248 167 275 (10 ROWS)
Of course, using it to generate random data is pretty much pointless, but I can imagine using it for loading data that is stored on remote server, and thus bypassing need to download it and store locally. Something along the lines of:
$ CREATE TABLE words (id int4 PRIMARY KEY, word text); CREATE TABLE $ \copy words FROM program 'wget -q -O - https://www.depesz.com/various/words.csv' $ SELECT COUNT(*) FROM words; COUNT ------- 72188 (1 ROW)
With larger datasets it will be better to run it using “COPY" and not “\copy" – to decrease number of times data has to be transferred.
It's actually pretty cool thing, thanks Etsuro.