I once wrote about this problem, but given that we now have DO blocks and procedures I can make nicer, easy to use example.
Continue reading How to ALTER tables without breaking application?
I once wrote about this problem, but given that we now have DO blocks and procedures I can make nicer, easy to use example.
Continue reading How to ALTER tables without breaking application?
Every now and then I see something like this:
SELECT u.* FROM users u WHERE 0 = (SELECT COUNT(*) FROM addresses a WHERE a.user_id = u.id);
and it kinda pains me. So figured, I'll write about what is the problem with it, and how to avoid such constructs.
Continue reading SQL best practices – don't compare count(*) with 0
This question happens every now and then in one of PostgreSQL support places.
Whenever it happens, I just suggest to not try, as interacting with outside world from database can be problematic, and instead use LISTEN/NOTIFY.
But it occurred to me, that while I know how to do it, I don't think I actually did it. It being: handle listen/notify in real life code. So let's try. I will not be writing actual email sending or http requesting, but will make sure that my program will get, from database, information when NOTIFY happens. How hard could it be?
Continue reading How can I send mail or HTTP request from database?
Was asked recently about optimization of interesting case. There was table like:
=$ CREATE TABLE input_data ( category_id INT8, object_id INT8, interaction_ts timestamptz, interaction_type TEXT, interaction_count INT4 );
And there was a code that was grouping it all by sum()ing interaction_count per category, object, interaction_type, and timestamp truncated to hour.
Basically, storing somewhere result of:
=$ SELECT category_id, object_id, date_trunc( 'hour', interaction_ts ) AS ts, SUM(interaction_count) FILTER (WHERE interaction_type = 'a') AS a_count, SUM(interaction_count) FILTER (WHERE interaction_type = 'b') AS b_count FROM input_data GROUP BY 1, 2, 3;
While talking about optimizations, one idea that came was to store whole day of counts in single row, as array. So the resulting count table would be:
=$ CREATE TABLE results ( category_id int8, object_id int8 interaction_day DATE, a_counts int4[], b_counts int4[] );
Where a_counts, and b_counts would always have 24 elements, one for each hour.
Now, how to roll it up like this?
Continue reading Grouping data into array of sums – fun with custom aggregates
We recently hit an interesting case where planner picked wrong plan. Kinda. And figured it could be used to show how to deal with such cases.
Continue reading Case study: optimization of weirdly picked bad plan
We had a case recently where one of our DBs failed over to a new primary. To this server (old primary, and then new primary) we had connection from some kind of CDC tool (Debezium, I think).
The thing is that while there was failover, this tool read (using logical decoding) changes on old primary to location 1F1F/4ADC3348, and on new db, it connected from location 1F1F/4ADC4038. Which means that there was some WAL that was not decoded and sent to Debezium.
This is fixable, but we need to know what tables were touched in the WAL stream, that Debezium didn't see, if any. Perhaps we could also get ids/location of the rows that were there?
Continue reading What tables were touched within given range of wal LSN?
Recently someone asked on Slack about what is transaction wraparound. Full answer is a bit too much for slack reply, but I can try to explain it in here.
psql, the database client for PostgreSQL has, since forever, support for variables.
These let you write certain queries in a way that is safe even when getting params from “outside".
Let's see what can be done with it…
Lately in couple of places I recommended people that they can solve their problem with queries using LATERAL. In some cases recipient of such suggestion indicated that they had no idea what LATERAL is. Which made me think that it might be good idea to write more about them (lateral queries)…
Also – I know that some of the examples I shown in here can be done differently, I just wanted to show how one can use LATERAL, and am terrible with coming up with better usecases.
Continue reading What is LATERAL, what is it for, and how can one use it?
Just so that it will be perfectly clear: the logs I have in mind are the ones for DBAs to read – with slow queries, errors, and other interesting information.
So, how does one find them?