Before I'll start let me say that I am fan of what Oleg and Teodor did – their work is great, and I do appreciate their time and ideas.
But – I simply don't like the idea of using FTS (Full Text Search) inside of database. Why? Let me show you.
Before I'll start let me say that I am fan of what Oleg and Teodor did – their work is great, and I do appreciate their time and ideas.
But – I simply don't like the idea of using FTS (Full Text Search) inside of database. Why? Let me show you.
Foreign keys are known for couple of things, but speeding up your system is not one of them. But sometimes, having them in place lets you make queries significantly faster.
How? Let me show you example I have seen lately (well, it's simplified example based on something much more convoluted, and definitely longer):
Today, Mattias|farm on IRC asked how to create primary key using HASH index. After some talk, he said that in some books it said that for “=" (equality) hash indexes are better.
So, I digged a bit deeper.
One database that I am monitoring uses a lot of stored procedures. Some of them are fast, some of them are not so fast. I thought – is there a sensible way to diagnose which part of stored procedure take the most time?
I mean – I could just put the logic into application, and then every query would have it's own timing in Pg logs, but this is not practical. And I also believe that using stored procedures/functions is way better than using plain SQL due to a number of reasons.
So, I'm back to question – how to check which part of function takes most of the time?
On 3rd of August, Tatsuo Ishii committed patch by ITAGAKI Takahiro:
Log Message: ----------- Multi-threaded version of pgbench contributed by ITAGAKI Takahiro, reviewed by Greg Smith and Josh Williams. Following is the proposal from ITAGAKI Takahiro: Pgbench is a famous tool to measure postgres performance, but nowadays it does not work well because it cannot use multiple CPUs. On the other hand, postgres server can use CPUs very well, so the bottle-neck of workload is *in pgbench*. Multi-threading would be a solution. The attached patch adds -j (number of jobs) option to pgbench. If the value N is greater than 1, pgbench runs with N threads. Connections are equally-divided into them (ex. -c64 -j4 => 4 threads with 16 connections each). It can run on POSIX platforms with pthread and on Windows with win32 threads. Here are results of multi-threaded pgbench runs on Fedora 11 with intel core i7 (8 logical cores = 4 physical cores * HT). -j8 (8 threads) was the best and the tps is 4.5 times of -j1, that is a traditional result. $ pgbench -i -s10 $ pgbench -n -S -c64 -j1 => tps = 11600.158593 $ pgbench -n -S -c64 -j2 => tps = 17947.100954 $ pgbench -n -S -c64 -j4 => tps = 26571.124001 $ pgbench -n -S -c64 -j8 => tps = 52725.470403 $ pgbench -n -S -c64 -j16 => tps = 38976.675319 $ pgbench -n -S -c64 -j32 => tps = 28998.499601 $ pgbench -n -S -c64 -j64 => tps = 26701.877815 Is it acceptable to use pthread in contrib module? If ok, I will add the patch to the next commitfest.
On 3rd of November Andrew Dunstan committed his patch which adds new function to PostgreSQL – suppress_redundant_updates_trigger().
This function is not for using in selects, but it can help you tremendously if your database access matches certain pattern.
Continue reading Waiting for 8.4 – suppress_redundant_updates_trigger
When I was working for one of customers we found some strange thing. We needed to found number of distinct sessions per day. Table layout was very simple:
as you perhaps know there is this site/blog called high scalability. it contains articles about various things related to performance, scalability, availability and so on.
usually, when there is something about databases, it is about mysql. luckily for us, today they showed something about postgresql: