psql, the database client for PostgreSQL has, since forever, support for variables.
These let you write certain queries in a way that is safe even when getting params from “outside".
Let's see what can be done with it…
To use them, we first need to know how to set them. For this, we have multiple apporaches:
- you can pass variables as arguments to psql: psql -v var=value (instead of -v x=y, you can use –set=x=y or –variable=x=y)
- you can set them using \set metacommand
- you can also set them to values from a query, using \gset, optionally with prefix
To use them, you generally use :var query or metacommand. But sometimes you might need to do something else to it. We'll see in a moment.
So, off to examples. Simplest thing you can do is:
=$ psql -X -v a=b psql (16beta1) TYPE "help" FOR help. =# \echo VALUE OF var-a IS :a VALUE OF var-a IS b
That was simple. Of course if you'd want to pass value with complicated “things" inside, you have to take into consideration quoting things in your shell. For example, if I'd want to pass hubert depesz lubaczewski as variable, it will not really work easily because:
=$ psql -X -v a=hubert depesz lubaczewski psql: error: connection TO server ON socket "/tmp/.s.PGSQL.5430" failed: FATAL: ROLE "lubaczewski" does NOT exist
This is simply because space separated params. And to pass the whole string as value, I need to quote it. For example like this:
=$ psql -X -v a='hubert depesz lubaczewski' psql (16beta1) TYPE "help" FOR help. =# \echo :a hubert depesz lubaczewski
< For even more complicated values (for example, containing both “ and ‘ characters) you might need to put the value in file, and load it, for example using something like:
=$ psql -X -v a="$( cat file.txt )"
Now, let's use the variables for queries.
Let's assume I want to select top-n rows from t table, where both n and t are given as variables.
So, I can imagine having sql file, named show.sql that contains:
SELECT * FROM :t LIMIT :n;
=$ psql -X -v n=5 -v t=pg_depend -f SHOW.sql classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype ---------+-------+----------+------------+----------+-------------+--------- 1247 | 12002 | 0 | 1259 | 12000 | 0 | i 1247 | 12001 | 0 | 1247 | 12002 | 0 | i 2618 | 12003 | 0 | 1259 | 12000 | 0 | i 1247 | 12007 | 0 | 1259 | 12005 | 0 | i 1247 | 12006 | 0 | 1247 | 12007 | 0 | i (5 ROWS)
this, theoretically is great. There are problems, though.
Basically, the values from variables are directly inlined to query. So, whatever I will put there, will be passed to query. Directly, without any change.
Which means I can use this for sql-injection:
=$ psql -X -v n="1; select version()" -v t=pg_depend -f SHOW.sql classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype ---------+-------+----------+------------+----------+-------------+--------- 1247 | 12002 | 0 | 1259 | 12000 | 0 | i (1 ROW) version ---------------------------------------------------------------------------------------------- PostgreSQL 16beta1 ON x86_64-pc-linux-gnu, compiled BY gcc (Debian 12.2.0-14) 12.2.0, 64-bit (1 ROW)
This is first problem. Second is weirder – with such approach you can't select data from some tables.
For example, let's make table:
=$ CREATE TABLE "do it" AS SELECT generate_series(1,5) i;
If I'd run my show.sql with t="do it", I will get:
=$ psql -X -v n="1" -v t="do it" -f SHOW.sql psql:SHOW.sql:1: ERROR: syntax error at OR near "do" LINE 1: SELECT * FROM do it LIMIT 1; ^
This is because, again, the values got directly inlined. What can be done about it? Use proper quotes.
Do do it you use one of syntaxes:
- :'variable'
- :"variable"
First will put the value in query, but quoting it as if was supposed to be normal literal value. Even if the value already contains ‘ characters!
Second syntax will quote the variable as if it was means to be identifier.
Knowing this I can now change my show.sql to be:
SELECT * FROM :"t" LIMIT :'n';
And now I both can select from do it table:
=$ psql -X -v n="1" -v t="do it" -f SHOW.sql i --- 1 (1 ROW)
And I can't do sql-injection:
=$ psql -X -v n="1; select version()" -v t=pg_depend -f SHOW.sql psql:SHOW.sql:1: ERROR: invalid INPUT syntax FOR TYPE BIGINT: "1; select version()" LINE 1: SELECT * FROM "pg_depend" LIMIT '1; select version()'; ^
In the beginning I mentioned that you can use \gset. This is amazing and very useful.
Basically you can get any query, and put output from it into variables. For example:
=# SELECT now(), version() \gset =# \echo now IS [:now], AND pg version IS :version now IS [2023-05-28 15:14:53.9115+02], AND pg version IS PostgreSQL 16beta1 ON x86_64-pc-linux-gnu, compiled BY gcc (Debian 12.2.0-14) 12.2.0, 64-bit
\gset gets optional param, which will be prefix to names of variables that will be generated:
=# SELECT * FROM pg_am LIMIT 1; oid | amname | amhandler | amtype -----+--------+----------------------+-------- 2 | heap | heap_tableam_handler | t (1 ROW) =# SELECT * FROM pg_am LIMIT 1 \gset xxx_ =# \echo :xxx_amhandler heap_tableam_handler
This is helpful if you want to get results from many queries and store them as variables, so you can always prefix their names not to have conflicts.
What happens, though, if you'd write a query that returns more than 1 row?
=# SELECT * FROM pg_am LIMIT 2 \gset am_ more than one ROW returned FOR \gset
Sweet. It will immediately fail, so you know what's going on.
Using variables, you can easily load larger value from file, and use it as normal text value, even if it has multiple lines, or weird characters:
=# \SET SHOW `cat show.sql` =# SELECT :'show'; ?COLUMN? -------------------------------- SELECT * FROM :"t" LIMIT :'n'; (1 ROW)
Couple of times I got questions from people on IRC, Slack, or Discord, how to use these variables in functions. As in SQL, or pl/PgSQL functions or DO blocks.
These people were trying to do stuff like:
CREATE OR REPLACE FUNCTION testit() RETURNS INT8 AS $$ DECLARE v_int INT4; BEGIN SELECT COUNT(*) INTO v_int FROM pg_class WHERE relkind = :'kind'; RETURN v_int; END; $$ LANGUAGE plpgsql;
but this fails immediately with:
ERROR: syntax error at OR near ":" LINE 5: ...T COUNT(*) INTO v_int FROM pg_class WHERE relkind = :'kind'; ^
The thing is, that psql processes variables before passing query to postgresql, and it can't just replace parts of strings – whole function body is a string, so it can't be modified by value substitution.
The way to solve it is to parametrize the function:
=# CREATE OR REPLACE FUNCTION testit(IN p_relkind TEXT) RETURNS INT8 AS $$ DECLARE v_int INT4; BEGIN SELECT COUNT(*) INTO v_int FROM pg_class WHERE relkind = p_relkind; RETURN v_int; END; $$ LANGUAGE plpgsql; CREATE FUNCTION $ \SET relkind r $ SELECT testit(:'relkind'); testit ──────── 74 (1 ROW)
In case of DO blocks, it's a bit more complicated, as they don't have parameters. In such case, I could (ab)use configuration system using:
=# SET depesz.some_name = :'relkind';
and then in DO block, use current_setting(‘depesz.some_name').
One note, though, you can't use psql variables within psql -c …. If you'd try, you will get error:
=$ psql -X -v a=b -c "select :'a'" ERROR: syntax error at OR near ":" LINE 1: SELECT :'a' ^
If you want to pass variable to simple thing, just use redirection:
=$ psql -X -v a=b <<< "select :'a'" ?COLUMN? ---------- b (1 ROW)
or file, as I did many times in examples above, like: psql -X -v n=5 -v t=pg_depend -f show.sql.
Finally, one hint. While you can use the as parts of queries, you can also use them instead of queries.
For example, I have in my ~/.psqlrc this, somewhat long line:
\SET replag 'SELECT format(''%s : Last TXN: %s, lag: %s'', now()::timestamp(0), pg_last_xact_replay_timestamp()::timestamp(0), (now() - pg_last_xact_replay_timestamp())::INTERVAL(3));'
thanks to this, in psql, I can simply type :replag and get replication lag from current server:
=$ :replag format ════════════════════════════════════════════════════════════════════════ 2023-05-28 13:49:59 : LAST TXN: 2023-05-28 13:49:59, lag: 00:00:00.064 (1 ROW)
Hope it helps, and will clear some misconceptions about what these are, and what they can be used for 🙂
Also good to note
(i bang my head for that this week)
it doesn’t work with inline commands (-c)
@eMerzh:
Not sure what you mean. I wrote about it already in the paragraph starts with “One note, though, you …”.
At the time you were commenting it was starting with “Finally, one note”, but while checking if I did write about it, I found that there are two “finally” sections, so I changed the one about -c.
As a last resort, in case you can’t turn your psql vars into function parameters or config vas, one can use the following construction:
“`
select format(
$gexec$
…code where %s, etc is replaced by the psql vars …
$gexec$,
:’psql_var’, … ) \gexec
“`
Thank you. I’d been looking at how to pull ${HOSTNAME%%.*} in as I don’t like %m or %M in my prompt. “[local]” is too vague when you have more than one server. But now I can pull my hostname in!
\set host `printf “${HOSTNAME%%.*}”`
Since PostgreSQL 14, you can use BEGIN ATOMIC which treats the function body as SQL code, not a string literal, enabling the kind of variable substitution discussed in your article.
@Malthe:
this will not be evaluated/substituted at runtime anyway, so not entirely sure what good would come out of it.
I have been using variables in my query en psql. But I get a problem if I try to substitute :ejer00.
If I \set ejer 23, I would like to get 2300. But it does not make the substitution.
Could you help me?
@Pablo:
sure. because :ejer00 is perfectly viable variable name. so how would psql know if you want:
– value of “e” and added string jer00
– value of “ej” and added string er00
…
Show me your usecase – generally how to do what you want will depend on what *exactly* you want. For example, perhaps this is the solution: