설문조사
PostgreSQL/PPAS 관련 듣고 싶은 교육은


총 게시물 110건, 최근 0 건
   

Distinct 2 columns

글쓴이 : 모델광 날짜 : 2022-07-08 (금) 22:42 조회 : 91

In the previous article I have taken advantage of the technique of extracting distinct values from an indexed column efficiently. The technique is so commonly used that many blog posts have introduced it. According to Jonathan Lewis, Sayan Malakshnov,a Russian, was one of those suspected developers who first found the strategy.

http://orasql.org/2012/09/21/distinct-values-by-index-topn/


In this note, I will explain how to find all the distinct combinations for the first two columns of a very large index in PostgreSQL without having to scan and aggregate the whole index or table. The solution in this note was worked out by Jonathan Lewis. I've tried to come up with a better solution than his, but I couldn't.


Let's start with a suitable table and index.


DROP TABLE T1;

CREATE TABLE T1

AS

SELECT i id

     , mod(i-1,3) val1

     , mod(i-1,10) val2

     , lpad('k',100,'k') dummy

  FROM generate_series(1,1e6) a(i)

 ORDER BY random();

ALTER TABLE T1 ADD CONSTRAINT T1_PK PRIMARY KEY(val1, val2, id);

ANALYZE T1;

SELECT relname, relpages, relpages*8/1024 as size_MB 

  FROM PG_CLASS

 WHERE RELNAME LIKE 't1%';

+---------+----------+---------+

| relname | relpages | size_MBytes |

+---------+----------+---------+

| t1      | 18182    | 142     |

| t1_pk   | 4511     | 35      |

+---------+----------+---------+


I have created a table with 3 values for val1, 10 values for val2, with a total of 30 combinations. The addition of the primary key starting with (val1, val2) is just a lazy way to ensure that I have a suitable index and val1 and val2 are both declared not null.


If the business requirement is to retrieve the distinct values for (val1, val2), you would have used a query like below:


SELECT DISTINCT VAL1, VAL2 

  FROM T1;


The following is the execution plan the optimizer produced.


 HashAggregate (actual time=437.168..437.172 rows=30 loops=1)

   Group Key: val1, val2

   Batches: 1  Memory Usage: 24kB

   Buffers: shared hit=11821 read=6361

   ->  Seq Scan on t1 (actual time=0.019..124.505 rows=1000000 loops=1)

         Buffers: shared hit=11821 read=6361

 Planning:

   Buffers: shared hit=122 read=31

 Planning Time: 3.880 ms

 Execution Time: 437.502 ms


To our disappointment, PostgreSQL didn't access the index we have. It visited 11821 memory blocks and 6361 disk blocks.

Another query that meets the business requirement would be like below:


SELECT VAL1, VAL2

  FROM T1

GROUP BY VAL1, VAL2;


 Group (actual time=151.274..155.047 rows=30 loops=1)

   Group Key: val1, val2

   Buffers: shared hit=11867 read=6329

   ->  Gather Merge (actual time=151.273..155.006 rows=90 loops=1)

         Workers Planned: 2

         Workers Launched: 2

         Buffers: shared hit=11867 read=6329

         ->  Sort (actual time=144.965..144.967 rows=30 loops=3)

               Sort Key: val1, val2

               Sort Method: quicksort  Memory: 26kB

               Buffers: shared hit=11867 read=6329

               Worker 0:  Sort Method: quicksort  Memory: 26kB

               Worker 1:  Sort Method: quicksort  Memory: 26kB

               ->  Partial HashAggregate (actual time=144.872..144.876 rows=30 loops=3)

                     Group Key: val1, val2

                     Batches: 1  Memory Usage: 24kB

                     Buffers: shared hit=11853 read=6329

                     Worker 0:  Batches: 1  Memory Usage: 24kB

                     Worker 1:  Batches: 1  Memory Usage: 24kB

                     ->  Parallel Seq Scan on t1 (actual time=0.038..31.946 rows=333333 loops=3)

                           Buffers: shared hit=11853 read=6329

 Planning:

   Buffers: shared hit=17 read=3 dirtied=1

 Planning Time: 0.547 ms

 Execution Time: 155.195 ms


The elapsed time dropped due to parallel processing. But it is still not accessing the index.


Now it is time to take advantage of the technique of getting distinct values using the Common Table Expression.

My first step will be to demonstrate the recursive CTE to get the distinct values for val1 using the index probe. I have included its execution plan to show that this does a minimal amount of work.


WITH RECURSIVE w1  AS (

(SELECT val1, val2  FROM t1 ORDER BY val1, val2 LIMIT 1)    --non-recursive term

UNION ALL

SELECT t1.val1, t1.val2                                                     --recursive term

  FROM w1 

JOIN LATERAL (SELECT val1, val2

                        FROM t1

                     WHERE t1.val1 > w1.val1

                 ORDER BY t1.val1, t1.val1

                   LIMIT 1) t1 ON TRUE

WHERE w1.val1 IS NOT NULL

    AND w1.val2 IS NOT NULL

)

SELECT * FROM w1

WHERE w1.val1 IS NOT NULL

    AND w1.val2 IS NOT NULL;


 CTE Scan on w1 (actual time=0.181..0.411 rows=3 loops=1)

   Filter: ((val1 IS NOT NULL) AND (val2 IS NOT NULL))

   Buffers: shared hit=8 read=6

   CTE w1

     ->  Recursive Union (actual time=0.178..0.407 rows=3 loops=1)

           Buffers: shared hit=8 read=6

           ->  Limit (actual time=0.176..0.177 rows=1 loops=1)

                 Buffers: shared hit=1 read=3

                 ->  Index Only Scan using t1_pk on t1 (actual time=0.174..0.174 rows=1 loops=1)

                       Heap Fetches: 0

                       Buffers: shared hit=1 read=3

           ->  Nested Loop (actual time=0.075..0.075 rows=1 loops=3)

                 Buffers: shared hit=7 read=3

                 ->  WorkTable Scan on w1 w1_1 (actual time=0.001..0.001 rows=1 loops=3)

                       Filter: ((val1 IS NOT NULL) AND (val2 IS NOT NULL))

                 ->  Limit (actual time=0.072..0.072 rows=1 loops=3)

                       Buffers: shared hit=7 read=3

                       ->  Index Only Scan using t1_pk on t1 t1_1 (actual time=0.064..0.064 rows=1 loops=3

                             Index Cond: (val1 > w1_1.val1)

                             Heap Fetches: 0

                             Buffers: shared hit=7 read=3

 Planning:

   Buffers: shared hit=3 read=4

 Planning Time: 0.863 ms

 Execution Time: 0.639 ms


At first sight we can spot that we've got an Index Only Scan as the first step of the recursive query, visiting just 1 buffered block and three disk blocks. The Limit in the plan indicates that it stopped reading the index after getting a single row. So PostgreSQL is clearly doing an efficient access for that value. We then see Nested Loop operation executed three times. By using LATERAL JOIN the inline view after the LATERAL is run for every w1.val1 before the LATERAL. The inline view is also accessed via the t1_pk index.

So we can get the val1 values very easily and efficiently with this recursive CTE technology. Using the same technology we can write some code to find the val2 values for each possible val1 value in turn:


WITH RECURSIVE w1 AS (

(SELECT val1, val2 FROM t1 ORDER BY val1, val2 LIMIT 1)

UNION ALL

SELECT t1.val1, t1.val2

  FROM w1 

JOIN LATERAL (SELECT val1, val2

                         FROM t1

                       WHERE t1.val1 > w1.val1

                     ORDER BY t1.val1, t1.val2

                       LIMIT 1) t1 ON TRUE

WHERE w1.val1 IS NOT NULL

),

w2 AS (

SELECT val1, val2 FROM w1

UNION ALL

SELECT w2.val1, (SELECT t1.val2

                            FROM t1

                          WHERE t1.val1 = w2.val1

                               AND t1.val2 > w2.val2

                          ORDER BY t1.val1, t1.val2

                          LIMIT 1) val2

  FROM w2

 WHERE w2.val2 is not null

     AND w2.val1 is not null

)

SELECT *

   FROM w2

WHERE w2.val2 is not null

;


In this query the second CTE (w2) looks similar to the first CTE (w1) - except I now have included an equality predicate against val1 based on the first of the two columns. The second CTE gets distinct val2 values for each val1 value. The following is the execution plan.


 CTE Scan on w2 (actual time=0.060..3.028 rows=30 loops=1)

   Filter: (val2 IS NOT NULL)

   Rows Removed by Filter: 3

   Buffers: shared hit=84 read=22

   CTE w1

     ->  Recursive Union (actual time=0.058..0.258 rows=3 loops=1)

           Buffers: shared hit=14

           ->  Limit (actual time=0.045..0.046 rows=1 loops=1)

                 Buffers: shared hit=4

                 ->  Index Only Scan using t1_pk on t1 (actual time=0.044..0.044 rows=1 loops=1)

                       Heap Fetches: 0

                       Buffers: shared hit=4

           ->  Nested Loop (actual time=0.065..0.065 rows=1 loops=3)

                 Buffers: shared hit=10

                 ->  WorkTable Scan on w1 (actual time=0.000..0.000 rows=1 loops=3)

                       Filter: (val1 IS NOT NULL)

                 ->  Limit (actual time=0.060..0.060 rows=1 loops=3)

                       Buffers: shared hit=10

                       ->  Index Only Scan using t1_pk on t1 t1_1 (actual time=0.056..0.056 rows=1 lops=3

                             Index Cond: (val1 > w1.val1)

                             Heap Fetches: 0

                             Buffers: shared hit=10

   CTE w2

     ->  Recursive Union (actual time=0.059..3.006 rows=33 loops=1)

           Buffers: shared hit=84 read=22

           ->  CTE Scan on w1 w1_1 (actual time=0.058..0.259 rows=3 loops=1)

                 Buffers: shared hit=14

           ->  WorkTable Scan on w2 w2_1 (actual time=0.079..0.248 rows=3 loops=11)

                 Filter: ((val2 IS NOT NULL) AND (val1 IS NOT NULL))

                 Rows Removed by Filter: 0

                 Buffers: shared hit=70 read=22

                 SubPlan 2

                   ->  Limit (actual time=0.089..0.089 rows=1 loops=30)

                         Buffers: shared hit=70 read=22

                         ->  Index Only Scan using t1_pk on t1 t1_2 (actul time=0.088..0.088 rows=1 loops=30

                               Index Cond: ((val1 = w2_1.val1) AND (val2 > w2_1.val2))

                               Heap Fetches: 0

                               Buffers: shared hit=70 read=22

 Planning Time: 0.679 ms

 Execution Time: 3.230 ms


The execution plan above is too complex to read in one go. Let's take a look at the CTE w2 part. Since we've got 3 rows from CTE w1, the first step of the recursive query produces 3 rows.  The SubPlan2 represents the scalar subquery. The execution plan shows that we've got 30 Index Only Scans with 70 shared hits and 22 disk I/Os. And that's exactly the right number of probes to return my result set.

You will notice that I have got various "is not null" predicates scattered throughout the query. In some cases this is to stop PostgreSQL from running into an infinite loop. It is kind of a default pattern for recursive CTEs and some of them could probably be removed with no change in meaning.


Wrap Up

We can take advantage of the recursive CTE technology to pick out the unique set of combinations of the first two columns of a multi-column index.

The downside to using the recursive CTE technology is that if there are too many distinct values, for example over 10,000, it will have poor performance.


Addendum

You can use this version of the last query as well.


WITH RECURSIVE w1 AS (

(SELECT val1, val2 FROM t1 ORDER BY val1, val2 LIMIT 1)

UNION ALL

SELECT t1.val1, t1.val2

  FROM w1 

JOIN LATERAL (SELECT val1, val2

               FROM t1

              WHERE t1.val1 > w1.val1

             ORDER BY t1.val1, t1.val2

             LIMIT 1) t1 ON TRUE

WHERE w1.val1 IS NOT NULL

),

w2 AS(

SELECT val1, val2  FROM w1

UNION ALL

SELECT t1.val1, t1.val2

  FROM w2

JOIN LATERAL (SELECT val1, val2

                FROM t1

               WHERE t1.val1 = w2.val1

                 AND t1.val2 > w2.val2

               ORDER BY t1.val1, t1.val2

               LIMIT 1) t1 ON TRUE

 WHERE w2.val2 is not null

   AND w2.val1 is not null

)

SELECT * FROM w2

WHERE w2.val2 is not null

;


   

postgresdba.com