On 24th of October 2022, Michael Paquier committed patch:
Add support for regexps on database and user entries in pg_hba.conf As of this commit, any database or user entry beginning with a slash (/) is considered as a regular expression. This is particularly useful for users, as now there is no clean way to match pattern on multiple HBA lines. For example, a user name mapping with a regular expression needs first to match with a HBA line, and we would skip the follow-up HBA entries if the ident regexp does *not* match with what has matched in the HBA line. pg_hba.conf is able to handle multiple databases and roles with a comma-separated list of these, hence individual regular expressions that include commas need to be double-quoted. At authentication time, user and database names are now checked in the following order: - Arbitrary keywords (like "all", the ones beginning by '+' for membership check), that we know will never have a regexp. A fancy case is for physical WAL senders, we *have* to only match "replication" for the database. - Regular expression matching. - Exact match. The previous logic did the same, but without the regexp step. We have discussed as well the possibility to support regexp pattern matching for host names, but these happen to lead to tricky issues based on what I understand, particularly with host entries that have CIDRs. This commit relies heavily on the refactoring done in a903971 and fc579e1, so as the amount of code required to compile and execute regular expressions is now minimal. When parsing pg_hba.conf, all the computed regexps needs to explicitely free()'d, same as pg_ident.conf. Documentation and TAP tests are added to cover this feature, including cases where the regexps use commas (for clarity in the docs, coverage for the parsing logic in the tests). Note that this introduces a breakage with older versions, where a database or user name beginning with a slash are treated as something to check for an equal match. Per discussion, we have discarded this as being much of an issue in practice as it would require a cluster to have database and/or role names that begin with a slash, as well as HBA entries using these. Hence, the consistency gained with regexps in pg_ident.conf is more appealing in the long term. **This compatibility change should be mentioned in the release notes.** Author: Bertrand Drouvot Reviewed-by: Jacob Champion, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/fff0d7c1-8ad4-76a1-9db3-0ab6ec338bf7@amazon.com
The commit message basically explains everything, but let me just quickly talk it through and show some examples.
pg_hba.conf file is used to decide who can connect to which database, from where, and using what authentication method. I wrote about it in more details earlier while I was explaining how to fix some issue.
Now, the format of the file is simple. There are two types of config lines:
- local DATABASE USER METHOD [OPTIONS]
- host* DATABASE USER ADDRESS METHOD [OPTIONS]
This new change modifies what can be but in DATABASE and USER fields.
To recap, in DATABASE you could have any number of database names, separated by , (comma).
So, for example you could have line line:
local db1,db2,postgres all trust
which would allow anyone to connect to one of the 3 listed databases without any kind of verification of who is who.
In case of USER field, you had more options:
- username
- group name (role) prefixed with + sign
- name of file that contains list of username, prefixed with @
and, of course, you can have any number of these elements, separated by comma.
So, for example:
local db1 depesz,postgres,+dbas,@/etc/postgres/cron.users.lst trust
would allow connection to db1 by users depesz, postgres, any user belonging to group dbas, and any user listed in /etc/postgres/cron.users.lst file.
Now, in both of these fields (DATABASE and USER), we can add regexps, by prefixing them with / (slash).
How would that look? For example like this:
local /^db[0-9]+$,/^depesz depesz trust
would let me connect to two groups of databases:
^db[0-9]+$
– any db that has name starting with “db" and then there are some (at least 1) digits. And then end of string. So, for example: db1, db13123, but not db23b^depesz
– any db that has name starting with depesz
Similarly I could use it in USER field, for example like this:
host all /-rw$ 10.1.2.0/24 scram-sha-256 host all /-ro$ 10.0.0.0/8 scram-sha-256
This would allow any user with name ending with -rw to login, as long as they connect from ip being one of 10.1.2.*, and any user with name ending with -ro from ip that is one of 10.*.*.*.
I think it looks great, and will allow nicer configuration in some cases. Well, as long as you don't mind regexps. But if you'll not go overboard with complexity, it should be really, really helpful thing.
As always, thanks to all involved.