설문조사
PostgreSQL/PPAS 관련 듣고 싶은 교육은


총 게시물 125건, 최근 1 건
   

first_value() vs. lag()

글쓴이 : 모델광 날짜 : 2022-08-06 (토) 20:52 조회 : 272

This article is about a subtle performance difference you can exeprience when you port SQL statements from Oracle to PostgreSQL.
In a migration project I came across an SQL statement of the following shape running on an AS-IS database (Oracle):

select
        ID,
        VISIT_DT,
        LEAVE_DT,
        LAST_VALUE(LEAVE_DT) OVER (PARTITION BY ID ORDER BY VISIT_DT
                    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) as PREV_LEAVE_DT,
        FIRST_VALUE(LEAVE_DT) OVER (PARTITION BY ID ORDER BY VISIT_DT
                  ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) as NEXT_LEAVE_DT,
        row_number() over (partition by ID order by VISIT_DT) as row_num
from
        t1;

We can notice that the query used a row_number(), a last_value() and a first_value() of a column called LEAVE_DT. All three analytic columns use the same partitioning and ordering. But those three columns have different frame_clause. When we take a look at the last_value() closely, we can see that the last_value() clause is implementing a function lag(). And the first_value() clause is implementing a function lead().

You may wonder, "why did the developer use last_value() and first_value() instead of lag() and lead()? Do they have the same performance? Will the query do three "WindowAgg" in the execution plan?"

From now on I will show you how the query works in PostgreSQL and how we can improve the performance. And in the last part of this article I will explain why the query is optimal in Oracle but is not optimal in PostgreSQL.

Here is a test script that I have been running on PostgreSQL 14.

DROP TABLE t;
CREATE TABLE t
AS
SELECT i as ID,
       '2022-05-17'::TIMESTAMP + RANDOM()*('2023-05-17'::TIMESTAMP - '2022-05-17'::TIMESTAMP) as VISIT_DT,
       '2022-05-17'::TIMESTAMP + RANDOM()*('2023-06-17'::TIMESTAMP - '2022-05-17'::TIMESTAMP) as LEAVE_DT
  FROM GENERATE_SERIES(1,100000) a(i), GENERATE_SERIES(1,40);

SELECT relname, reltuples, relpages*8/1000 as size_MB
  FROM pg_catalog.pg_class
 WHERE RELNAME = 't';

+---------+-----------+---------+
| relname | reltuples | size_mb |
+---------+-----------+---------+
| t           | 4000000.0 | 203     |
+---------+-----------+---------+

I have created a table with 4,000,000 rows, consisting of 100,000 ID values, and 40 rows per ID. The 40 rows for an ID have a different VISIT_DT and a different LEAVE_DT.

After creating the data set, I have created another table (t2) using the last_value() and first_value() functions to check the speed of the original query.

create table t2
as
select
        ID,
        VISIT_DT,
        LEAVE_DT,
        LAST_VALUE(LEAVE_DT) OVER (PARTITION BY ID ORDER BY VISIT_DT
                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) as PREV_LEAVE_DT,
        FIRST_VALUE(LEAVE_DT) OVER (PARTITION BY ID ORDER BY VISIT_DT
                 ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) as NEXT_LEAVE_DT,
        row_number() over (partition by ID order by VISIT_DT) as row_num
from
        t
;
Time: 7693.183 ms (00:07.693)

I have created another table (t1) using the lag() and lead() functions to generate a previous and next LEAVE_DT for each row, partitioning by ID, ordering by VISIT_DT.

create table t1
as
select
        ID,
        VISIT_DT,
        LEAVE_DT,
        lag (LEAVE_DT, 1) over (partition by ID order by VISIT_DT) as PREV_LEAVE_DT,
        lead(LEAVE_DT, 1) over (partition by ID order by VISIT_DT) as NEXT_LEAVE_DT,
        row_number() over (partition by ID order by VISIT_DT) as row_num
from
        t
;
Time: 6266.032 ms (00:06.266)

Note that the time to create the t1 table was 6266 ms while the time to create the t2 table was 7693 ms.
When I repeated the above test, creating the table t1 using the lag() and lead() functions was always faster.
To understand what is really going on here, we need to take a look at the execution plan provided by PostgreSQL:

Here is the execution plan when we use the last_value() and first_value() functions.

WindowAgg  (cost=586142.37..806142.37 rows=4000000 width=44)
  Output: id, visit_dt, leave_dt, (last_value(leave_dt) OVER (?)), (first_value(leave_dt) OVER (?)), row_number() OVER (?)
  ->  WindowAgg  (cost=586142.37..736142.37 rows=4000000 width=36)
        Output: id, visit_dt, leave_dt, (last_value(leave_dt) OVER (?)), first_value(leave_dt) OVER (?)
        ->  WindowAgg  (cost=586142.37..666142.37 rows=4000000 width=28)
              Output: id, visit_dt, leave_dt, last_value(leave_dt) OVER (?)
              ->  Sort  (cost=586142.37..596142.37 rows=4000000 width=20)
                    Output: id, visit_dt, leave_dt
                    Sort Key: t.id, t.visit_dt
                    ->  Seq Scan on public.t  (cost=0.00..65478.00 rows=4000000 width=20)
                          Output: id, visit_dt, leave_dt

First of all, PostgreSQL has to read the data and sort by "T.ID and T.VISIT_DT". This sorted data is fed to the first aggregation (last_value()), before it is passed on to the next aggregation(first_value()). Lastly the second WindowAgg passes its output to
the last WindowAgg(row_number()). Overall our data has to go through WindowAgg three times, which is not a good thing to do.

Here is the execution plan when we use the lag() and lead() functions.

WindowAgg  (cost=586142.37..686142.37 rows=4000000 width=44)
  Output: id, visit_dt, leave_dt, lag(leave_dt, 1) OVER (?), lead(leave_dt, 1) OVER (?), row_number() OVER (?)
  ->  Sort  (cost=586142.37..596142.37 rows=4000000 width=20)
        Output: id, visit_dt, leave_dt
        Sort Key: t.id, t.visit_dt
        ->  Seq Scan on public.t  (cost=0.00..65478.00 rows=4000000 width=20)
              Output: id, visit_dt, leave_dt

Wow, this plan tells us that PostgreSQL used WindowAgg operation just once, which is the reason why the query is faster. If you are processing a big data set, this can make a huge difference and speed up your queries tremendously.
Compared to the query with the lag() and lead() functions, the query with the last_value() and first_value() functioins have different frame_clause:

ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING

It seems that when the frame_clause is identical PostgreSQL windowing functions can share WidowAgg operation. When the frame_clause is of different shape, PostgreSQL windowing functions have to apply WindowAgg to each windowing function.

Most people simply write their SQL code and execute it, assuming that the optimizer will take care of things on its own.  While this is usually the case there are still corner cases, where you have to check execution plans and tweak the SQL statement.

Conclusion
When your query has windowing functions and they have different window frame clauses, try to find a way to make those frame clauses identical. For example you may replace last_value() and first_value() with lag() and lead().

Footnote
I did the same test in Oracle database. Unlike PostgreSQL, the query with the last_value() and first_value() functions was faster in Oracle. In Oracle version 19.1 the time to create the t1 table (lag(), lead()) was 8540 ms while the time to create the t2 table was 7610 ms. When I used the DBMS_XPLAN.DISPLAY_CURSOR function to display the execution plan, I was able to see the transformed query by the optimizer in the Column Projection Information section of the output.

The following was the SQL code I ran under Oracle versioin 19.1.
select
        ID,
        VISIT_DT,
        LEAVE_DT,
        lag (LEAVE_DT, 1) over (partition by ID order by VISIT_DT) as PREV_LEAVE_DT,
        lead(LEAVE_DT, 1) over (partition by ID order by VISIT_DT) as NEXT_LEAVE_DT,
        row_number() over (partition by ID order by VISIT_DT) as row_num
from
        t
;

Here is the transformed query by the Oracle optimizer. I got this query in the Column Projection Information section of the DBMS_XPLAN.DISPLAY_CURSOR output.

SELECT ID,
          VISIT_DT,
          LEAVE_DT,
          COUNT(*) OVER (PARTITION BY ID ORDER BY VISIT_DT
                                ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),
          FIRST_VALUE(LEAVE_DT) OVER (PARTITION BY VISIT_DT
                                ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PREV_LEAVE_DT,
          COUNT(*) OVER (PARTITION BY ID ORDER BY VISIT_DT
                                ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING),
          FIRST_VALUE(LEAVE_DT) OVER (PARTITION BY ID ORDER BY VISIT_DT
                                ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS NEXT_LEAVE_DT,
          ROW_NUMBER() OVER (PARTITION BY ID ORDER BY VISIT_DT) AS ROWNUM
  FROM T;
     
In Oracle the lag() function turns into a first_value() function that looks at the row preceding the current row in the partition (rows between 1 preceding and 1 preceding). However it first had to count over the same clause. I can't explain why Oracle has to count over the same clause. The call to lead() basically does the same thing, but uses (rows between 1 following and 1 following) to access the next row in the partition. It seems that the COUNT(*) OVER clause is the culprit in the poor performance.
The developer who wrote the SQL statement with the last_value() and first_value() functions in Oracle must have been a professioinal. Before doing this test, I did not know there were performance differences between those queries.


   

postgresdba.com