How much RAM is PostgreSQL using?

(disclaimer: all the data and examples in here are on Linux – the same data can be probably obtained on other systems too, it's just that I work on Linux and don't know other systems well).

This question pops occasionally in various places – PostgreSQL is using too much memory, why is that, and how can it be mitigated?

Before we can go to “optimizing", we should understand the problem. But do we? Both standard tools – ps and top – lie. How/why? Let's see.

Continue reading How much RAM is PostgreSQL using?

“= 123” vs. “= ‘depesz'”. What is faster?

There is this idea that normal form in databases require you to use integer, auto incrementing, primary keys.

The idea was discussed by many people, I will just point you to series of three blog posts on the subject by Josh Berkus ( part 1, 2 and 3, and reprise).

One of the points that proponents of surrogate keys (i.e. those based on integer and sequences) raise is that comparing integers is faster than comparing texts. So,

SELECT * FROM users WHERE id = 123

is faster than

SELECT * FROM users WHERE username = 'depesz'

Is it?

Continue reading “= 123" vs. “= ‘depesz'". What is faster?

OmniPITR 0.7.0

Just released new version of OmniPITR.

This version has one important new feature: when you're calling omnipitr-backup-slave, it will make backups only of required xlog files, and not, as previously, of all in walarchive directory.

This is important, especially in case you have multiple slaves, or you keep shared long-term walarchive. Previously – backups would get all files from walarchive (-s option to omnipitr-backup-slave), but now, it picks just the ones that are needed.

On somehow related note – I will be working now, finally, to get omnipitr-monitor functionality working.

OmniPITR 0.6.0

Just released new version, 0.6.0 (it should be visible on pgxn soon) of OmniPITR set of tools.

New version has one new feature – parallelism.

This works in omnipitr-archive and omnipitr-backup-* programs, and allows for parallel delivery to remote destinations (multiple -dr switches).

Also – if you're using compresses wal archive and omnipitr-backup-slave reading from it – all the wal files have to be decompressed before making backup – and this decompression can be parallelized too.

All parallelization is controled using -PJ option (–parallel-jobs), so you can add “-PJ 10" to get up to 10 decompressions at the same time or up to 10 deliveries at the same time.

Let’s talk dirty

Important disclaimer: the module that I'm writing about was written by my colleague Phil Sorber.

We all have been in, or heard about, situation like this:

$ UPDATE users SET password = '...'; WHERE id = 123;

(hint: first ; is before where).

Of course you should have backups, and you can protect yourself from it. But what if backup is too old, and you didn't protect yourself?

Continue reading Let's talk dirty

Szukam webmastera [polish only]

Ostatnio pisałem o moim pierwszym, malutkim, projekciku w Pythonie – kanasta.depesz.com.

Całość działa, ale – mimo, że nie spodziewam się mieć tam użytkowników (tzn. innych niż nasza czwórka), chciałbym by to wyglądało, i może było odrobinę prostsze w użyciu.

Szukam więc webmastera. Czego oczekuję? Rzeczony webmaster …

  • pobierze sobie z githuba soft, postawi go u siebie (do celów testowych, ja mam gdzie hostować) – potrzebne – postgres, python, flask, psycopg2, matplotlib. W razie gdyby się okazało, że to olbrzymi problem – dogadamy się.
  • w oparciu o to co widzi, zaproponuje wygląd (jpgi z wyglądem)
  • wymieni ze mną kilka(naście) maili nt. propozycji wyglądu i moich “przemyśleń"
  • przekonwertuje jpg'i do postaci szablonów jinja2, cssów, js'ów
  • doda logikę w js'ach do walidacji i automatyzacji pewnych operacji (głównie rejestracja nowych partii)
  • dostarczy zmiany w dowolny sposób (pull request, mail, gołąb pocztowy)
  • wystawi na całość fakturę

Chętna/chętny? Napisz. Aby móc wybrać potrzebowałbym tylko wstępnej wyceny (koszt i czas). Przesłanie obrazka z wstępną wizualizacją bardzo pomoże.

Nie wiem ile jestem w stanie zapłacić. Tzn. nie wiem ile takie coś może kosztować, a nie chcę zastrzegać, że zapłacę “x" bo może się okazać, że moje “x" jest np. o dwa rzędy wielkości za wysokie jak na taki mały projekcik.

Praca nie jest “na tempo, bo jutro użytkownicy na to wchodzą". To jest małe, spokojne, luźne zleconko które pewnie można zrobić w godzinę.

Request for help with Python/Flask – Prośba o pomoc z Pythonem/Flaskiem

( wersja polska poniżej )

I (not-so) recently started to learn Python. To have some playground that I can work on, I decided to write simple website that will let me track scores of a card game that I play with my family – Canasta.

To write it, I chose to use Flask framework, so I learned at the same time both Python and Flask.

Final result (without layout, just functionality) is on github.

If any of you does write Python and/or Flask, I would greatly appreciate all comments. Even the harsh ones. If anything is wrong, or simply not really good – let me know – I'm treating it as a way to learn so all feedback would be good.

Just a word of warning – if you'll decide to look at it – you will be dealing with very bad Python code. Brace yourself.


Niedawno zacząłem uczyć się Pythona. Uczę się najlepiej robiąc coś, więc stwierdziłem, że zrobię prosty site do śledzenia wyników gry w którą gram z rodziną – kanasty.

Zdecydowałem, że użyję Flask'a – dzięki czemu uczyłem się jednocześnie i języka (Python) i frameworka (Flask).

Działająca wersja (bez wyglądu, sama funkcjonalność!) jest na githubie.

Jeśli znasz Pythona i/lub Flaska, byłbym bardzo wdzięczny za przejrzenie tego kodu i skomentowanie. Nawet zbluzganie. Jeśli cokolwiek jest źle, lub nie-za-dobrze, daj mi znać. To (ten soft) to dla mnie metoda na nauczenie się, więc każdy komentarz jest mile widziany.

Słówko ostrzeżenia jedynie: jeśli się zdecydujesz na to spojrzeć, miej świadomość, że to będzie bardzo zły kod w Pythonie. Tragiczny. Bądź gotów.

Waiting for 9.2 – pg_stat_statements improvements

Three interesting patches:

  • On 27th of March, Robert Haas committed patch:
    New GUC, track_iotiming, to track I/O timings.
     
    Currently, the only way to see the numbers this gathers is via
    EXPLAIN (ANALYZE, BUFFERS), but the plan is to add visibility through
    the stats collector and pg_stat_statements in subsequent patches.
     
    Ants Aasma, reviewed by Greg Smith, with some further changes by me.
  • On 27th of March, Robert Haas committed patch:
    Expose track_iotiming information via pg_stat_statements.
     
    Ants Aasma, reviewed by Greg Smith, with very minor tweaks by me.
  • On 29th of March, Tom Lane committed patch:

    Improve contrib/pg_stat_statements to lump "similar" queries together.
     
    pg_stat_statements now hashes selected fields of the analyzed parse tree
    to assign a "fingerprint" to each query, and groups all queries with the
    same fingerprint into a single entry in the pg_stat_statements view.
    In practice it is expected that queries with the same fingerprint will be
    equivalent except for values of literal constants.  To make the display
    more useful, such constants are replaced by "?" in the displayed query
    strings.
     
    This mechanism currently supports only optimizable queries (SELECT,
    INSERT, UPDATE, DELETE).  Utility commands are still matched on the
    basis of their literal query strings.
     
    There remain some open questions about how to deal with utility statements
    that contain optimizable queries (such as EXPLAIN and SELECT INTO) and how
    to deal with expiring speculative hashtable entries that are made to save
    the normalized form of a query string.  However, fixing these issues should
    require only localized changes, and since there are other open patches
    involving contrib/pg_stat_statements, it seems best to go ahead and commit
    what we've got.
     
    Peter Geoghegan, reviewed by Daniel Farina

Continue reading Waiting for 9.2 – pg_stat_statements improvements