On 7th of April 2023, Tom Lane committed patch:
Add array_sample() and array_shuffle() functions. These are useful in Monte Carlo applications. Martin Kalcher, reviewed/adjusted by Daniel Gustafsson and myself Discussion: https://postgr.es/m/9d160a44-7675-51e8-60cf-6d64b76db831@aboutsource.net
Arrays are supported in PostgreSQL since forever, documentation for Pg15 lists 18 functions. And now, we're getting two more:
- array_sample() – get some (one or more) random elements from given array
- array_shuffle() – to randomize order of elements in given array
Let's see how that works:
$ SELECT array_sample('{a,b,c,d,e,f,g,h,i,j,k}'::text[], 1); array_sample ────────────── {b} (1 ROW) $ SELECT array_sample('{a,b,c,d,e,f,g,h,i,j,k}'::text[], 5); array_sample ────────────── {a,h,c,j,e} (1 ROW)
If you'd pick sample size (second argument) that is larger than source table, you will get an error:
$ SELECT array_sample('{a,b,c,d,e,f,g,h,i,j,k}'::text[], 20); ERROR: sample SIZE must be BETWEEN 0 AND 11
As for array_shuffle, it's work is pretty obvious:
$ SELECT array_shuffle('{a,b,c,d,e,f,g,h,i,j,k}'::text[]); array_shuffle ───────────────────────── {e,j,k,c,h,d,i,g,f,b,a} (1 ROW)
In case of multi-dimensional arrays, the shuffling happens on first dimension, that is sub-arrays will remain in their original order:
$ SELECT array_sample( '{{a,b,c},{d,e,f},{g,h,i},{j,k,l},{m,n,o}}'::text[][], 3 ); array_sample ─────────────────────────── {{j,k,l},{a,b,c},{g,h,i}} (1 ROW) $ SELECT array_shuffle( '{{a,b,c},{d,e,f},{g,h,i},{j,k,l},{m,n,o}}'::text[][] ); array_shuffle ─────────────────────────────────────────── {{m,n,o},{g,h,i},{a,b,c},{j,k,l},{d,e,f}} (1 ROW)
Pretty cool. Now, if we only could get official, in core, array_sort, array_unique, that would be awesome 🙂
Anyway, these two functions are definitely useful, so thanks a lot to everyone involved.
Useful for what?
https://en.wikipedia.org/wiki/Monte_Carlo_method