This note is only for those who use Enterprise DB or Oracle.
If
you are concerned with vanilla PostgreSQL, you are not advised to read
this note. If you use Oracle or Enterprised DB, this article is worth
reading.
Now
I am working as a DBA in a project where we use EDB and I am frequently
confronted with SQL statements which use the NVL function. NVL is just
an old Oracle syntax for COALESCE and Oracle still retains NVL for
backward compatibility. EDB seems to implement the NVL function to make
migration from Oracle smooth. But the NVL function has two serious drawbacks.
In this post I will give you the reason why you should avoid using the NVL function in Oracle or EDB.
Here is some code to create the test data set:
--tested on PostgreSQL 14.4 (EnterpriseDB Advanced Server 14.4.0)
create table ORDERS (ord_no bigint,cust_id varchar(20),comment varchar(100),ord_date varchar(8));
ALTER TABLE ORDERS ADD CONSTRAINT PK_ORDERS PRIMARY KEY(ORD_NO);
CREATE
TABLE ORDERS_DETAIL(ORD_LINE_NO BIGINT NOT NULL,ORD_NO BIGINT NOT
NULL,PROD_ID VARCHAR(10) NOT NULL,COMMENT VARCHAR(100),ORD_AMT BIGINT);
ALTER TABLE ORDERS_DETAIL ADD CONSTRAINT PK_ORDERS_DETAIL PRIMARY KEY(ORD_LINE_NO);
CREATE INDEX ORDERS_DETAIL_X01 ON ORDERS_DETAIL(ORD_NO, PROD_ID);
CREATE INDEX ORDERS_DETAIL_X02 ON ORDERS_DETAIL(ORD_NO, ORD_AMT);
insert into ORDERS
select i as ord_no
, 'C'||mod(i,10) as cust_id
, lpad('X',10,'Y') as comment
, to_char(to_date('20191001','YYYYMMDD')+mod(i,60),'yyyymmdd') as ord_date
from generate_series(1,1000000) a(i);
INSERT INTO ORDERS_DETAIL
SELECT i as ORD_LINE_NO
, mod(i,1000000) AS ORD_NO
, 'PP'||MOD(i,5) AS PROD_ID
, lpad('X',10,'Y') as comment
, case when i < 1000 then i*100 else i end as prod_amt
FROM generate_series(1,10000000) a(i);
I
have created the ORDERS and ORDERS_DETAIL table with the size of
65Mbyters and 650Mbytes respectively. For testing purposes I have run
the following SQL statement.
SELECT ORD_NO
,CUST_ID
,NVL(A.COMMENT,(SELECT MAX(PROD_ID)
FROM ORDERS_DETAIL B WHERE A.ORD_NO = B.ORD_NO))
FROM ORDERS A
WHERE ORD_NO <= 100000;
Note that I have used the NVL function in the SELECT list. And we know that the COMMENT column has no NULL values.
Think
for a moment; Will EDB run the scalar subquery even though there are no
NULL values in the COMMENT column? For your reference, if you run the
above query in Oracle, Oracle executes the scalar subquery, which is
inefficient.
Here is the plan I got when I ran the query under EDB 14.4:
Index Scan using pk_orders on scott.orders a (cost=0.42..84258.58 rows=98893 width=43) (actual time=0.048..199.752 rows=100000 loops=1)
Output: a.ord_no, a.cust_id, nvl(a.comment, ((SubPlan 2))::character varying)
Index Cond: (a.ord_no <= 100000)
Buffers: shared hit=302546
SubPlan 2
-> Result (cost=0.80..0.81 rows=1 width=32) (actual time=0.001..0.002 rows=1 loops=100000)
Output: $1
Buffers: shared hit=301534
InitPlan 1 (returns $1)
-> Limit (cost=0.43..0.80 rows=1 width=32) (actual time=0.001..0.001 rows=1 loops=100000)
Output: ((b.prod_id)::text)
Buffers: shared hit=301534
-> Index Only Scan Backward using orders_detail_x01 on
scott.orders_detail b (cost=0.43..4.13 rows=10 width=32) (actual
time=0.001..0.001 rows=1 loops=100000)
Output: b.prod_id
Index Cond: ((b.ord_no = a.ord_no) AND (b.prod_id IS NOT NULL))
Heap Fetches: 0
Buffers: shared hit=301534
Query Identifier: 8902623697392898361
Planning Time: 0.222 ms
Execution Time: 203.987 ms
The number of block I/Os ( shared hit=301534) in the Index Only Scan node tells us that the scalar subquery was executed 100000 times ( loops=100000), which is bad.
Now let's replace the NVL fuction with the COALESCE function and run the query.
SELECT ORD_NO
,CUST_ID
,COALESCE(A.COMMENT,(SELECT MAX(PROD_ID)
FROM ORDERS_DETAIL B WHERE A.ORD_NO = B.ORD_NO))
FROM ORDERS A
WHERE ORD_NO <= 100000;
And here is the query plan it generates:
Index Scan using pk_orders on scott.orders a (cost=0.42..84011.35 rows=98893 width=43) (actual time=0.017..12.163 rows=100000 loops=1)
Output: a.ord_no, a.cust_id, COALESCE(a.comment, ((SubPlan 2))::character varying)
Index Cond: (a.ord_no <= 100000)
Buffers: shared hit=1012
SubPlan 2
-> Result (cost=0.80..0.81 rows=1 width=32) (never executed)
Output: $1
InitPlan 1 (returns $1)
-> Limit (cost=0.43..0.80 rows=1 width=32) (never executed)
Output: ((b.prod_id)::text)
-> Index Only Scan Backward using orders_detail_x01 on
scott.orders_detail b (cost=0.43..4.13 rows=10 width=32) (never executed)
Output: b.prod_id
Index Cond: ((b.ord_no = a.ord_no) AND (b.prod_id IS NOT NULL))
Heap Fetches: 0
Query Identifier: -7418638523580744970
Planning Time: 0.182 ms
Execution Time: 14.226 ms
Note that the orders_detail_x01 index access was not executed. Look at the elapsed time and block IO
differences. The elapsed time dropped from 203 to 13 ms and the block IO
fell from 302546 to 1012. In Oracle you can see the same behavior. When I see this pattern of queries in Oracle, I can improve performance simply by substituting NVL with COALESCE.
There is another reason why you must not use NVL in EDB. The reason is smple:
NVL is less performant than COALESCE.
We can check the performance difference by running the following query:
SELECT ORD_NO
,CUST_ID
,COALESCE(A.COMMENT,'Z') as COMMENT
FROM ORDERS A
WHERE ORD_NO <= 1000000
UNION ALL
SELECT ORD_NO
,CUST_ID
,NVL(A.COMMENT,'Z') as COMMENT
FROM ORDERS A
WHERE ORD_NO <= 1000000;
Append (cost=0.00..72206.00 rows=2000000 width=43) (actual time=0.011..310.744 rows=2000000 loops=1)
Buffers: shared hit=14706
-> Seq Scan on scott.orders a (cost=0.00..19853.00 rows=1000000 width=43) (actual time=0.010..101.993 rows=1000000 loops=1)
Output: a.ord_no, a.cust_id, COALESCE(a.comment, 'Z'::character varying)
Filter: (a.ord_no <= 1000000)
Buffers: shared hit=7353
-> Seq Scan on scott.orders a_1 (cost=0.00..22353.00 rows=1000000 width=43) (actual time=0.018..131.158 rows=1000000 loops=1)
Output: a_1.ord_no, a_1.cust_id, nvl(a_1.comment, 'Z'::character varying)
Filter: (a_1.ord_no <= 1000000)
Buffers: shared hit=7353
Query Identifier: 7906406971189728346
Planning:
Buffers: shared hit=8
Planning Time: 0.155 ms
Execution Time: 353.892 ms
Note
that when we use COALESCE the planner's estimated cost for the upper
branch of the query is 19853, whereas the cost for the lower branch of
the query is 22353. PostgreSQL query planner is predicting that it will
take longer to execute the lower branch of the query using NVL. We can
see that the planner's estimate was correct by checking the actual
execution times of each branch of the query, which are 101.993 ms and
131.158 ms respectively.
This
inefficiency is just the tip of the iceberg. There are a lot of side
effects you have to keep in mind when you use Oracle specific features
supported by EDB. I will provide them in another note.
Conclusion
In Oracle or EDB you should not use NVL. Please get into the habit of using COALESCE regardless of how annoying it is to type.
I
started my Oracle career on version 7.3 and NVL was the only function
available to filter through null expressions. Then Oracle 9i came in
with the COALESCE function. To this day when we use Oracle 21 or
something, I still encounter developers who utilize NVL, which I think
is the reason why EDB decided to implement the feature. The fact that
you have the luxury of being able to use NVL in your non-Oracle database
does not mean that the function is optimal. COALESCE is more
performant. Why keep using the ancient and inefficient function?
Footnote
You can also replace NVL by the CASE WHEN clause like the following:
SELECT ORD_NO
,CUST_ID
,CASE WHEN A.COMMENT IS NOT NULL THEN A.comment
ELSE (SELECT MAX(PROD_ID) FROM ORDERS_DETAIL B WHERE A.ORD_NO = B.ORD_NO) END
FROM ORDERS A
WHERE ORD_NO <= 100000;
The
thing is that the SQL statement with the COALESCE function is slightly
(1~2 %) faster than the SQL statement with the CASE clause and I do not
know the reason.