Support parallel btree index builds. To make this work, tuplesort.c and logtape.c must also support parallelism, so this patch adds that infrastructure and then applies it to the particular case of parallel btree index builds. Testing to date shows that this can often be 2-3x faster than a serial index build. The model for deciding how many workers to use is fairly primitive at present, but it's better than not having the feature. We can refine it as we get more experience. Peter Geoghegan with some help from Rushabh Lathia. While Heikki Linnakangas is not an author of this patch, he wrote other patches without which this feature would not have been possible, and therefore the release notes should possibly credit him as an author of this feature. Reviewed by Claudio Freire, Heikki Linnakangas, Thomas Munro, Tels, Amit Kapila, me. Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
This is really, really sweet.
Let's see how it works.
Will need, of course, some table with data:
=$ CREATE TABLE test AS SELECT generate_series(1,100000000) AS id;
Now, I can make an index on the table, on id column. To have some sensible comparison, I'll run it three times.
First, without parallelization:
=$ SET maintenance_work_mem = '1GB'; =$ \timing =$ CREATE INDEX q ON test (id); TIME: 25294.185 ms (00:25.294) =$ DROP INDEX q; TIME: 167.724 ms =$ CREATE INDEX q ON test (id); TIME: 25538.743 ms (00:25.539) =$ DROP INDEX q; TIME: 165.651 ms =$ CREATE INDEX q ON test (id); TIME: 25129.483 ms (00:25.129) =$ DROP INDEX q; TIME: 169.231 ms
and now, let's try with 8-way parallel work:
=$ SET maintenance_work_mem = '1GB'; =$ SET max_parallel_workers = 16; =$ SET max_parallel_maintenance_workers = 8; =$ \timing =$ CREATE INDEX q ON test (id); TIME: 16922.233 ms (00:16.922) =$ DROP INDEX q; TIME: 163.034 ms =$ CREATE INDEX q ON test (id); TIME: 16958.423 ms (00:16.958) =$ DROP INDEX q; TIME: 165.781 ms =$ CREATE INDEX q ON test (id); TIME: 16484.234 ms (00:16.484) =$ DROP INDEX q; TIME: 166.140 ms
So, it looks like non-parallel create index took around 25 seconds, and parallel one – ~ 16.5. Performance improvement is “8 times faster", but it's clearly there.
Out of curiosity, I tested the speed with max_parallel_maintenance_workers set from 0 to 8. Results (only fastest time reported):
max_parallel_maintenance_workers | index creation time (milliseconds) |
---|---|
0 | 25001.376 |
1 | 20440.343 |
2 | 17927.330 |
3 | 17493.936 |
4 | 18911.540 |
5 | 19349.886 |
6 | 18859.417 |
7 | 18794.899 |
8 | 19017.379 |
In any way – it's a great addition, one that will make my life, as DBA, a bit easier. Thanks to all involved.
Update:
Based on comment from Peter Geoghegan I did one more test. This time my id column was textual, and contained random data:
- 10 million rows
- each row contained one word
- each word contained 5-14 characters
- each character was one of: (“a".."z", “A".."Z", “0".."9″)
- table was 400MB in size
As previously, I ran create index three times with varying max_parallel_maintenance_workers. Results:
max_parallel_maintenance_workers | index creation time (milliseconds) |
---|---|
0 | 24632.961 |
1 | 13342.488 |
2 | 10393.877 |
3 | 9231.971 |
4 | 9760.243 |
5 | 9247.169 |
6 | 9347.192 |
7 | 9588.541 |
8 | 9132.571 |
This time the fastest was the run with 8 workers, using almost 2.7 times less time than without parallelization.
Thanks Peter for the hint.
You’d probably have been able to show a much larger speed-up by using input into the CREATE INDEX that isn’t already sorted, and/or has more expensive comparisons. A multi-column CREATE INDEX on text columns, for example.
I sometimes use this benchmark: https://github.com/petergeoghegan/gensort
Great work!
Does that also works for CIC (Create Index Concurrently)? Other question/hint: if i’m running a restore (custom-format) with -j , what would happen in the stage of creating indexes? Burning cores?
But anyway, great feature!
@AKRETSCHMER
Yes, it works for CREATE INDEX CONCURRENTLY. Only the first table scan/initial index build in parallelized. In other words, only the “CREATE INDEX” part of CIC is performed in parallel, though that’s still very likely to be where the majority of time is spent.
If you’re running a restore, then nothing special happens or doesn’t happen. You should have set max_parallel_workers in a way that prevents thrashing when many concurrent parallel operations are underway.
@Peter Geoghegan:
Updated the post with test based on your suggestion. Thanks.
PG 11b2
1. – If I set max_parallel_maintenance_workers > 0, restore plain-format backup, then try to run vacuum full analyze
I get : ERROR: catalog is missing 1 attribute(s) for relid 345832
2. max_parallel_maintenance_workers = 0 , restore plain-format backup , change max_.. to 4 , restart, run ‘vacuum full analyze’ > segfault :]