This is a follow-up to a note titled "Correlated Subquery Collapse", published on May 28, 2022. You can find the note at the following link: https://www.postgresdba.com/bbs/board.php?bo_table=C05&wr_id=200&page=3
Before continuing with this article, please make sure to read the previous one. In that note, I mentioned that PostgreSQL has limitations when it comes to unnesting a correlated subquery, and I demonstrated how we can improve query performance by rewriting the query. At that time, I examined how Oracle transformed the query in question and used the Oracle-transformed query in PostgreSQL to enhance performance. The Oracle version used was 12.2.0.1.
Here is the rewritten query, followed by its execution plan in PostgreSQL 15.1.
I will call this the first query from now on.
SELECT EMPNO, SAL, COMM, DEPTNO
FROM (
       SELECT A.EMPNO, A.SAL, A.COMM, A.DEPTNO, AVG(B.COMM + B.SAL) AS AVG_VAL
       FROM EMPLOYEE A, EMPLOYEE B
       WHERE A.DEPTNO = B.DEPTNO
       GROUP BY A.DEPTNO, A.EMPNO, A.SAL, A.COMM
     ) X
WHERE SAL + COMM > AVG_VAL
ORDER BY DEPTNO, EMPNO;
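For readers who do not have the previous note at hand, the correlated form that this query replaces can be reconstructed from the rewrite itself. The SQLite sketch below is an illustration only: the five-row data set is made up, and the correlated query is my reconstruction rather than a quote from the previous note. It checks that the correlated form and the unnested rewrite return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (empno INTEGER, sal INTEGER,"
            " comm INTEGER, deptno INTEGER)")
# Made-up sample data: two departments.
cur.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                [(1, 1000, 100, 10), (2, 2000, 200, 10), (3, 3000, 300, 10),
                 (4, 1500, 50, 20), (5, 1600, 60, 20)])

# Reconstructed correlated form (an assumption based on the rewrite).
correlated = cur.execute("""
    SELECT empno, sal, comm, deptno
    FROM employee a
    WHERE sal + comm > (SELECT AVG(b.sal + b.comm)
                        FROM employee b
                        WHERE b.deptno = a.deptno)
    ORDER BY deptno, empno""").fetchall()

# The unnested rewrite shown above (first query).
rewritten = cur.execute("""
    SELECT empno, sal, comm, deptno
    FROM (SELECT a.empno, a.sal, a.comm, a.deptno,
                 AVG(b.comm + b.sal) AS avg_val
          FROM employee a JOIN employee b ON a.deptno = b.deptno
          GROUP BY a.deptno, a.empno, a.sal, a.comm) x
    WHERE sal + comm > avg_val
    ORDER BY deptno, empno""").fetchall()

assert correlated == rewritten
print(correlated)  # -> [(3, 3000, 300, 10), (5, 1600, 60, 20)]
```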
Sort  (actual time=3382.842..3383.083 rows=5014 loops=1)
  Sort Key: x.deptno, x.empno
  Sort Method: quicksort  Memory: 506kB
  Buffers: shared hit=220
  ->  Subquery Scan on x  (actual time=3360.295..3366.464 rows=5014 loops=1)
        Buffers: shared hit=220
        ->  HashAggregate  (actual time=3360.294..3366.065 rows=5014 loops=1)
              Group Key: a.empno
              Filter: ((a.sal + a.comm) > avg((b.comm + b.sal)))
              Batches: 1  Memory Usage: 5009kB
              Rows Removed by Filter: 4986
              Buffers: shared hit=220
              ->  Hash Join  (actual time=2.525..1152.728 rows=8333336 loops=1)
                    Hash Cond: (a.deptno = b.deptno)
                    Buffers: shared hit=220
                    ->  Seq Scan on employee a  (actual time=0.006..2.203 rows=10000 loops=1)
                          Buffers: shared hit=110
                    ->  Hash  (actual time=2.478..2.479 rows=10000 loops=1)
                          Buckets: 16384  Batches: 1  Memory Usage: 587kB
                          Buffers: shared hit=110
                          ->  Seq Scan on employee b  (actual time=0.003..1.133 rows=10000 loops=1)
                                Buffers: shared hit=110
Planning:
  Buffers: shared hit=28
Planning Time: 0.217 ms
Execution Time: 3383.719 ms
By rewriting the query, we were able to reduce the elapsed time from 13.6 seconds to 3.3 seconds.
Recently, I conducted a test on the same query using Oracle 23c, the most recent version, and discovered that Oracle was transforming the query differently.
Please note that in both Oracle versions, I did not gather table and index statistics.
Here is the execution plan I got when I ran the query under Oracle 23.1.0.
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 5004 |00:00:00.01 | 166 | | | |
| 1 | SORT ORDER BY | | 1 | 500 | 5004 |00:00:00.01 | 166 | 267K| 267K| 237K (0)|
|* 2 | HASH JOIN | | 1 | 500 | 5004 |00:00:00.01 | 166 | 1335K| 1335K| 1110K (0)|
| 3 | VIEW | VW_SQ_1 | 1 | 12 | 12 |00:00:00.01 | 83 | | | |
| 4 | HASH GROUP BY | | 1 | 12 | 12 |00:00:00.01 | 83 | 1345K| 1345K| |
| 5 | TABLE ACCESS FULL| EMPLOYEE | 1 | 10000 | 10000 |00:00:00.01 | 83 | | | |
| 6 | TABLE ACCESS FULL | EMPLOYEE | 1 | 10000 | 10000 |00:00:00.01 | 83 | | | |
-----------------------------------------------------------------------------------------------------------------------
The following is the execution plan obtained in Oracle 12.2.0.1.
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 5035 |00:00:09.77 | 169 | | | |
|* 1 | FILTER | | 1 | | 5035 |00:00:09.77 | 169 | | | |
| 2 | SORT GROUP BY | | 1 | 82 | 10000 |00:00:09.76 | 169 | 1045K| 1045K| 928K (0)|
|* 3 | HASH JOIN | | 1 | 82 | 8333K|00:00:00.72 | 169 | 1888K| 1888K| 1624K (0)|
| 4 | TABLE ACCESS FULL| EMPLOYEE | 1 | 82 | 10000 |00:00:00.01 | 84 | | | |
| 5 | TABLE ACCESS FULL| EMPLOYEE | 1 | 82 | 10000 |00:00:00.01 | 84 | | | |
----------------------------------------------------------------------------------------------------------------------
It was at this point that I realized that the rewritten query in the previous note was not optimal.
In Oracle 23.1.0, the correlated subquery was collapsed, but the resulting inline view was not merged with the main query. Oracle performed the HASH GROUP BY operation before joining the EMPLOYEE table in the main query.
In Oracle 12.2.0.1, on the other hand, the correlated subquery was unnested, and the resulting inline view was merged with the main query. After merging, Oracle used a HASH JOIN operation and then performed a SORT GROUP BY operation.
In Oracle 12.2.0.1, the estimated cardinality of the EMPLOYEE table is much smaller than the actual cardinality, which suggests that the plan may not be optimal. In Oracle 23c, by contrast, the estimated cardinality (E-Rows) is quite accurate, which suggests that the plan is likely efficient. We can therefore deduce that the rewritten query in the previous note may not have been the optimal one.
Therefore, I have rewritten the query in question as follows, which I will refer to as the second query in the following paragraphs:
SELECT A.EMPNO, A.SAL, A.COMM, A.DEPTNO
FROM EMPLOYEE A,
     (SELECT DEPTNO, AVG(B.SAL + COALESCE(B.COMM, 0)) AS AVG_SAL
      FROM EMPLOYEE B
      GROUP BY DEPTNO
     ) B
WHERE A.DEPTNO = B.DEPTNO
AND (A.SAL + COALESCE(A.COMM)) > B.AVG_SAL
ORDER BY A.DEPTNO, A.EMPNO;
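The shape of this rewrite can be demonstrated on a small made-up data set. The SQLite sketch below is an illustration, not the article's benchmark; note that SQLite requires COALESCE to take at least two arguments, so the filter here uses COALESCE(a.comm, 0) rather than the single-argument form in the query above. It also counts the distinct departments to show how few rows the derived table feeds into the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (empno INTEGER, sal INTEGER,"
            " comm INTEGER, deptno INTEGER)")
# Made-up sample data: two departments.
cur.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                [(1, 1000, 100, 10), (2, 2000, 200, 10), (3, 3000, 300, 10),
                 (4, 1500, 50, 20), (5, 1600, 60, 20)])

# Second query: aggregate per department first, then join.
second = cur.execute("""
    SELECT a.empno, a.sal, a.comm, a.deptno
    FROM employee a
    JOIN (SELECT deptno, AVG(sal + COALESCE(comm, 0)) AS avg_sal
          FROM employee
          GROUP BY deptno) b ON a.deptno = b.deptno
    WHERE a.sal + COALESCE(a.comm, 0) > b.avg_sal
    ORDER BY a.deptno, a.empno""").fetchall()
print(second)  # -> [(3, 3000, 300, 10), (5, 1600, 60, 20)]

# The derived table contributes only one row per department to the join.
per_dept = cur.execute(
    "SELECT COUNT(DISTINCT deptno) FROM employee").fetchone()[0]
print(per_dept)  # -> 2
```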
Here is the plan I got when I ran the above query under PostgreSQL 15.1.
Note, in particular, the figures for Execution Time and shared hit:
Sort  (actual time=13.961..14.187 rows=5014 loops=1)
  Sort Key: a.deptno, a.empno
  Sort Method: quicksort  Memory: 506kB
  Buffers: shared hit=220
  ->  Hash Join  (actual time=3.591..7.771 rows=5014 loops=1)
        Hash Cond: (a.deptno = b.deptno)
        Join Filter: ((a.sal + COALESCE(a.comm)) > (avg((b.sal + COALESCE(b.comm, '0'::numeric)))))
        Rows Removed by Join Filter: 4986
        Buffers: shared hit=220
        ->  Seq Scan on employee a  (actual time=0.006..0.576 rows=10000 loops=1)
              Buffers: shared hit=110
        ->  Hash  (actual time=3.578..3.579 rows=12 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 9kB
              Buffers: shared hit=110
              ->  HashAggregate  (actual time=3.569..3.574 rows=12 loops=1)
                    Group Key: b.deptno
                    Batches: 1  Memory Usage: 24kB
                    Buffers: shared hit=110
                    ->  Seq Scan on employee b  (actual time=0.001..0.565 rows=10000 loops=1)
                          Buffers: shared hit=110
Planning Time: 0.094 ms
Execution Time: 14.397 ms
What a surprise! Even though we cannot observe any improvement in the number of block I/Os, the elapsed time has dropped from 3383 ms to 14 ms. When I wrote the previous note, I had missed one important tuning principle:
Perform a GROUP BY operation before joining; that is to say, reduce the number of rows participating in the join.
In the first query, 10000 rows from EMPLOYEE in the main block and 10000 rows from EMPLOYEE in the subquery block took part in the join. Because every employee row matches every other row in the same department, this produced an intermediate data set of 8333336 rows. In the second query, 10000 rows from EMPLOYEE in the main block and only 12 pre-aggregated rows from EMPLOYEE in the subquery block took part in the join. That is why the second plan performs significantly better despite having the same number of block I/Os.
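The intermediate row count from the first plan can be reproduced with simple arithmetic. The department sizes below are an assumption (the article does not state the exact distribution): 12 departments sharing 10000 employees almost evenly, a split chosen because it reproduces the plan's figure exactly.

```python
# Assumed department sizes: 4 departments of 834 plus 8 of 833 employees.
dept_sizes = [834] * 4 + [833] * 8
assert sum(dept_sizes) == 10000  # total employees, as in the article

# Join-before-aggregate (first query): every row in a department matches
# every row in the same department, so the join emits sum(n_i * n_i) rows.
join_first = sum(n * n for n in dept_sizes)
print(join_first)  # -> 8333336, matching the Hash Join's rows= figure

# Aggregate-before-join (second query): only one averaged row per
# department enters the join, so just 12 rows are hashed.
print(len(dept_sizes))  # -> 12
```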
Conclusion
PostgreSQL has a prominent limitation: it cannot unnest a correlated aggregate subquery. To overcome this, we need to rewrite a badly performing query by hand and strive to reduce the number of rows participating in the join.
Footnote
When I gathered statistics information in Oracle 12c, I was able to obtain the same execution plan as in Oracle 23c.