Scalar subquery caching and Memoization2

Author: 모델광   Date: 2022-02-05 (Sat) 10:45   Views: 280

As expected, the execution plan now tells us that the Sort operation completed inside the view v1, and that the sort was done within work_mem, while the previous query resorted to an external merge because work_mem was insufficient. The elapsed time dropped from 750 ms to 666 ms because the narrower data set was sorted.
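For reference, the examples that follow can be reproduced with a schema along these lines. This is a hypothetical reconstruction: the column names are taken from the plans, and the row counts are inferred from rows=500000 on emp and Misses: 200 in the Memoize node.

```sql
-- Hypothetical setup; column names come from the plan output,
-- row counts from rows=500000 (emp) and Misses: 200 (dept).
create table dept (
    id        integer,
    dept_name text,
    padding   text,
    constraint dept_pk primary key (id)
);

create table emp (
    id      integer,
    dept_id integer,
    ename   text,
    padding text
);

insert into dept
select i, 'DEPT' || i, rpad('x', 100, 'x')
  from generate_series(1, 200) i;

insert into emp
select i, 1 + (i % 200), 'ENAME' || i, rpad('x', 100, 'x')
  from generate_series(1, 500000) i;

analyze dept;
analyze emp;
```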


So it seems we can do better by sorting a thinner data set whenever a query requires a Sort operation. Getting back to the topic of enable_memoize, we can rewrite the problematic query using a lateral join.


select e.id, e.ename, d.dept_name
  from emp e,
       lateral (select *
                  from dept d
                 where e.dept_id = d.id
                offset 0) d
 order by e.ename;


Sort (actual time=326.492..376.167 rows=500000 loops=1)
  Output: e.id, e.ename, d.dept_name
  Sort Key: e.ename
  Sort Method: external merge  Disk: 66552kB
  Buffers: shared hit=10605, temp read=8319 written=8321
  ->  Nested Loop (actual time=0.031..178.767 rows=500000 loops=1)
        Output: e.id, e.ename, d.dept_name
        Buffers: shared hit=10605
        ->  Seq Scan on public.emp e (actual time=0.006..25.795 rows=500000 loops=1)
              Output: e.id, e.dept_id, e.ename, e.padding
              Buffers: shared hit=10205
        ->  Memoize (actual time=0.000..0.000 rows=1 loops=500000)
              Output: d.dept_name
              Cache Key: e.dept_id
              Hits: 499800  Misses: 200  Evictions: 0  Overflows: 0  Memory Usage: 40kB
              Buffers: shared hit=400
              ->  Subquery Scan on d (actual time=0.001..0.001 rows=1 loops=200)
                    Output: d.dept_name
                    Buffers: shared hit=400
                    ->  Index Scan using dept_pk on dept d (actual time=0.001..0.001 rows=1 loops=200)
                          Output: NULL::integer, d_1.dept_name, NULL::text
                          Index Cond: (d_1.id = e.dept_id)
                          Buffers: shared hit=400
Planning Time: 0.111 ms
Execution Time: 400.474 ms


In reading an execution plan, there is one general tip (or warning, perhaps) that experienced SQL tuners offer.


If you start out by looking for one particular thing you will miss lots of important clues; on a first pass through the plan just try to notice anything that looks a little informative, then go back for a more detailed investigation on a second pass through the plan.


On a first pass through the execution plan, we notice a new node, Memoize, which is a PostgreSQL 14 feature. Hits: 499800 and Misses: 200 tell us that the executor found the value in the in-memory cache 499,800 times, and those cache hits did not incur any block I/O.

What makes the query performant is the Subquery Scan on d operation. The executor performs the Subquery Scan only 200 times, not 500,000. Like scalar subquery caching in Oracle, the result of a correlated inline view can be cached, or memoized, which reduces the number of index accesses from 500,000 to 200. As shown above, this has drastic effects in some cases: here the query fetches only 400 buffer blocks under the Memoize node because almost every lookup is served from the in-memory cache. But the plan above still has an inefficiency: it sorts the wide data set of e.id, e.ename, and d.dept_name. So maybe it would be worth rewriting the query to sort the "narrower" data set before calling the correlated inline view:


select e.id, e.ename, d.dept_name
  from (select emp.id, emp.ename, emp.dept_id
          from emp
         order by emp.ename
        offset 0) e,
       lateral (select *
                  from dept d
                 where e.dept_id = d.id
                offset 0) d;


Nested Loop (actual time=208.280..356.746 rows=500000 loops=1)
  Output: emp.id, emp.ename, d.dept_name
  Buffers: shared hit=10605
  ->  Sort (actual time=208.241..225.032 rows=500000 loops=1)
        Output: emp.id, emp.ename, emp.dept_id
        Sort Key: emp.ename
        Sort Method: quicksort  Memory: 50878kB
        Buffers: shared hit=10205
        ->  Seq Scan on public.emp (actual time=0.018..49.249 rows=500000 loops=1)
              Output: emp.id, emp.ename, emp.dept_id
              Buffers: shared hit=10205
  ->  Memoize (actual time=0.000..0.000 rows=1 loops=500000)
        Output: d.dept_name
        Cache Key: emp.dept_id
        Hits: 499800  Misses: 200  Evictions: 0  Overflows: 0  Memory Usage: 40kB
        Buffers: shared hit=400
        ->  Subquery Scan on d (actual time=0.001..0.001 rows=1 loops=200)
              Output: d.dept_name
              Buffers: shared hit=400
              ->  Index Scan using dept_pk on dept d (actual time=0.001..0.001 rows=1 loops=200)
                    Output: NULL::integer, d_1.dept_name, NULL::text
                    Index Cond: (d_1.id = emp.dept_id)
                    Buffers: shared hit=400
Planning Time: 0.364 ms
Execution Time: 369.616 ms


As expected, the plan now sorts the data set of id, ename, and dept_id, which is narrower than the set of id, ename, and dept_name. The elapsed time dropped from 400 ms to 369 ms.


Conclusion

1. When there is a Sort operation, try to sort the narrower data set before assembling the final result set.

2. PostgreSQL 14's Memoize feature (controlled by the enable_memoize parameter) can be useful for nested loop joins.
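If you want to verify that Memoize is responsible for the low buffer count, you can toggle the planner setting per session and re-run EXPLAIN. A sketch; enable_memoize is on by default in PostgreSQL 14 and later:

```sql
show enable_memoize;        -- 'on' by default in PostgreSQL 14+

set enable_memoize = off;   -- the planner then runs the parameterized
                            -- scan once per outer row, with no cache
explain (analyze, verbose, buffers)
select e.id, e.ename, d.dept_name
  from emp e,
       lateral (select *
                  from dept d
                 where e.dept_id = d.id
                offset 0) d
 order by e.ename;

reset enable_memoize;
```

With the setting off, the Memoize node disappears from the plan and the buffer count under the inner scan grows with the number of outer rows.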

   

Footnote

If your database server has a large work_mem, you will not witness an external merge in the Sort operation, and the reduction in elapsed time may not be noticeable. However, it is apparent that the server consumes less work_mem when you sort the narrower data set. We should try to use fewer system resources to keep things running at optimal speed.
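To observe the difference between an external merge and an in-memory quicksort yourself, you can shrink or grow work_mem for the session before running EXPLAIN. The values below are only illustrative:

```sql
show work_mem;            -- 4MB is the shipped default

set work_mem = '4MB';     -- small: expect "Sort Method: external merge"
-- run EXPLAIN (ANALYZE) on the query here

set work_mem = '128MB';   -- large: expect "Sort Method: quicksort"
-- run EXPLAIN (ANALYZE) again and compare

reset work_mem;
```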


   

postgresdba.com