my thoughts on getting random row

this topic has been written about by many smart people – from the recent past, by greg sabino mullane and josh berkus.

they show 4 different approaches:

  1. order by random()
  2. >= random() limit 1
  3. random column
  4. random aggregate

all these approaches have their benefits and drawbacks, but i'd like to show another one (polish readers saw the approach already in january 2007, but this time i will make the code more robust).

Continue reading my thoughts on getting random row

speeding up like ‘%xxx%’

as most of you know postgresql can easily speedup searches using:

field like 'something%'

and (less easily):

field like '%something'

but how about:

field like '%something%'

general idea is to use some kind of full text search/indexing – tsearch, lucene, sphinx, you name it.

but sometimes you can't install fts/fti, or it doesn't really solve your problem. is there any help? let's find out.

Continue reading speeding up like ‘%xxx%'

objects in categories – counters with triggers

so there you go, you have some “categories" and some objects. for simplicity let's assume one object can be in only one category.

if this is too theoretical for you – let's assume these are “mails in folders", “photos in galleries", “posts in categories" or “auctions in categories on ebay".

everything clear? now, let's assume you want to know how many “objects" are in given “category".

most basic way to do it is:

SELECT COUNT(*) FROM objects WHERE category = some_category;

but this method is far from optimal. now, we'll learn how to do it better.

one warning for those of you who read the rss feed – if you say “yeah, i know the code, it's simple" – ask yourself – is your code deadlock-proof?

Continue reading objects in categories – counters with triggers

avoiding “empty” updates

just recently i saw a very interesting situation.

one table in customer database (with about 1.5 milion rows) was *very often* updated. using series of simple:

UPDATE TABLE SET FIELD = .., WHERE id = ...

updates always updated 1 record, search was using primary key, so it was quite fast.

what was strange was the fact that the table get about 20 times more updates then the next most updated table.

after careful checking i found out that a lot (about 60-70%) of the updates actually didn't change anything!

they were simply setting values that were already there.

so, i started to think about hwo to avoid this kind of situation.

and this is what i found out.

Continue reading avoiding “empty" updates

(postgre)sql gotchas

i would like to write about a things that are not really errors by themselves – they are simply situations when database acts in undesired way. a way that's perfectly logical, but where the logic is not always clearly seen “at first sight".

so, don't expect any postgresql-bashing. if you want some, indicate so in comments, i'll find some things to bash postgresql for.

but in here i'd rather bash us – people – for making mistakes. the ones that are not really easy to see as mistakes for database.

Continue reading (postgre)sql gotchas

indexable “field like ‘%something'” – update

thomas reiss just published his follow up to my post about indexed searches in “like ‘%something'" cases.

i don't understand french so i can't tell what exactly he wrote, but i understand c code and graphs 🙂

basically – he wrote c function that does reverse() of strings, and benchmarked it against pl/pgsql and pl/perl versions. of course c version is the fastest, but just take a look at how much faster it is. and how simple it is.

how many transactions per second?

i wanted to know how many transactions per second is my machine processing.

how to do so? a simple select to pg_stat_database will do the job (actually 2 selects 🙂

but since i have to write it anyway, perhaps i can/should make it so it will print the current value continuously?

and, while i'm at it, some kind of graph wouldn't be bad 🙂

Continue reading how many transactions per second?

better results paging in postgresql 8.2

some time ago merlin moncure wrote about one of new features of postgresql 8.2 – row-wise comparison.

i also read about it, but at first didn't find anything very useful about it – after all it doesn't give you any new functionality. any kind “(a, b, c) > (…)" can be written with standard column-based operators. so it's basically just a syntactic-sugar.

that's true. but merlin pointed me to not-so-obvious benefit – basically it allows much better paging of results. how? let's see.

Continue reading better results paging in postgresql 8.2