During an Oracle to PostgreSQL migration, I was tasked with optimizing a query that ran twice as slow in PostgreSQL. The migrated query in PostgreSQL appeared somewhat convoluted. I'll postpone the script to create the test data to the end of this note and start with the query that required optimizations in PostgreSQL 13.2:
SELECT ORD_NO , CUST_NO , COUNT(*) FILTER (WHERE RN1=1) OVER () AS C_CNT , COUNT(*) FILTER (WHERE RN2=1) OVER () AS O_CNT FROM ( SELECT ORD_NO , CUST_NO , ROW_NUMBER() OVER (PARTITION BY CUST_NO) AS RN1 , ROW_NUMBER() OVER (PARTITION BY ORD_STATUS_CD) AS RN2 FROM ONLINE_ORDER ) FOO;
And here is the resuting execution plan obtained when I ran the query in PostgreSQL 13.2.
In the migrated query, we are calculating the count of distinct CUST_NO values and ORD_STATUS_CD values across the entire set of rows. We know that the FILTER clause followed by COUNT(*) is specific to PostgreSQL. This led me to wonder how the original query in the Oracle database was structured. And the query in Oracle was of the following format:
SELECT ORD_NO , CUST_NO , COUNT(DISTINCT CUST_NO) OVER () AS C_CNT , COUNT(DISTINCT ORD_STATUS_CD) OVER () AS O_CNT FROM ONLINE_ORDER;
For your reference, here is the execution plan pulled from memory on Oracle 12.2.
We can observe that the query is twice as fast in Oracle (1.96 seconds vs. 0.94 seconds). When I asked the developer who converted the query why he did change the query in PostgreSQL, he said that the query with COUNT(DISTINCT col_name) OVER () did not work in PostgreSQL. When I ran the Oracle query in PostgreSQL, I got this error message:
ERROR: DISTINCT is not implemented for window functions
So the query in PostgreSQL, converted by the develper, appeared to be a reasonable strategy in extracting the count of disticnt values of two columns. However, the problem was that the query was slower than in the Oracle database. When we take a closer look at the execution plan in PostgreSQL, we can observe that we have two Sort operations and three WindowAgg operations in PostgreSQL. Both the Sort and WindowAgg operations are quite expensive.
One of the principles in optimizing a query is to reduce the Sort, WindowAgg, GroupAgg operations. Considering the principle, we can reconstruct the query as follows:
WITH counts AS ( SELECT COUNT(DISTINCT CUST_NO) AS C_CNT, COUNT(DISTINCT ORD_STATUS_CD) AS O_CNT FROM ONLINE_ORDER ) SELECT ORD_NO, CUST_NO, (SELECT C_CNT FROM counts) AS C_CNT, (SELECT O_CNT FROM counts) AS O_CNT FROM ONLINE_ORDER;
Here is the break-down of the execution plan: 1) Main Query (Seq Scan on online_order node) It is a sequential scan of the table online_order, which returned 1,000,000 rows. 2) CTE counts node - This node is based on an Aggregate operation, which suggests that it is calculating some aggregated values. - The CTE scans the same online_order table (aliased as online_order_1) that the main query scan. 3) InitPlan2 and 3 - There are two "InitPlan" operations, denoted as $1 and $2. - They access the results of CTE "count" to retrieve values from it.
Note that there is only one Aggregate node. Even though we are accessing the online_order table two times, the elapsed time dropped from 1931 to 712 ms.
In PostgreSQL, the "InitPlan" node is executed only once. When we look at the actual time part of each node, we can observe that the "CTE counts" node was executed before other nodes were executed.
For entertainment, I executed the revised query in Oracle 12c. When I executed the revised query in Oracle, the query became faster as well. Here is the plan of the restructured query in Oracle 12c:
In Oracle, the database accessed the ONLINE_ORDER table twice. However, the two WINDOW SORT operations were replaced by a single HASH GROUP BY operation. It is worth noting that the elapsed time descreased from 0.94 to 0.59, even though the block I/O increased from 19014 to 47839.
Conclusion Whenever you have the need to use analytic functions, it is worth just pausing for a second, and making sure that there is no other alternative to construct the query.
The main point of this note is to point out that it's a good idea to remove SORT operations when we start playing around with analytic functions. At times the strategy of accessing a table twice is a viable option to avoid sort operations. In particular, be cautious of using analytical functions.
Sometimes, you might reduce block I/O by leveraging window functions but that implementation comes at the cost of longer elapsed time. When we eliminate analytic functions, we have a better chance of achieving improved performance.
Footnote The developer who ported the Oracle query to PostgreSQL, was, of course, very smart to use PostgreSQL specific syntax. Query changes in a migration project may introduce significant side effects. What we have to do is find big wins using simple strategies that avoid complicated code changes - we need to be sure that we haven't introduced side-effects that mean we have moved the problem rather than solving it.
Test Code If you want to experiment further, here is the SQL to create the test data in PostgreSQL 13.2:
drop table online_order; create table online_order ( ord_no bigint not null, cust_no int not null, ord_date timestamp not null, ord_dt varchar(8) not null, ord_status_cd varchar(1) not null, comment varchar(100) );
INSERT into online_order select i, mod(i,1000000) as cust_no ,current_date - mod(i,1000) as ord_date ,to_char((current_date - mod(i,1000)),'yyyymmdd') as ord_dt ,(mod(i,4) + 1) as ord_status_cd ,lpad('x',100,'x') from generate_series(1,2000000,2) a(i);