PostgreSQL 은 특별합니다.
단순한 데이터베이스가 아니라, 하나의 어플리케이션 플랫폼이기 때문입니다.

공지사항

뉴스 및 기사

	총 게시물 177건, 최근 0 건

Speeding Up Subquery with OR 2

글쓴이 : 모델광 날짜 : 2022-03-12 (토) 04:39 조회 : 2799

This is the second example to demonstrate the tuning technique :

"When there are complex OR predicates, rewrite the SQL as UNION ALL queries."

We separate the two parts of the OR predicates to help the optimizer build an efficient plan with ease.

Here are the sample tables and SQL statements to demonstrate the principle above. The following scripts were run under PostgreSQL 13.

CREATE TABLE ords (

ORD_ID INT NOT NULL,

CUST_ID VARCHAR(10) NOT NULL,

ORD_DATE DATE NOT NULL,

ETC_CONTENT VARCHAR(100));

ALTER TABLE ords ADD CONSTRAINT ORDS_PK PRIMARY KEY(ORD_ID);

CREATE INDEX ORDS_X01 ON ORDS (CUST_ID);

INSERT INTO ORDS

SELECT i

,lpad(mod(i,1000)::text,10,'cust')

,date '2021-06-07'+mod(i,624)

,rpad('x',100,'x')

FROM generate_series(1,1000000) a(i);

CREATE TABLE delivery (

ORD_ID INT NOT NULL,

VEHICLE_ID VARCHAR(10) NOT NULL,

START_DATE DATE NOT NULL,

END_DATE DATE NOT NULL,

ETC_REMARKS VARCHAR(100));

INSERT INTO DELIVERY

SELECT i

, MOD(i,1000)

, date '2021-01-01' + mod(i,1000)

, date '2021-01-05' + mod(i,1000)

, rpad('x',100,'x')

FROM generate_series(1,1000000) a(i);

ALTER TABLE DELIVERY ADD CONSTRAINT DELIVERY_PK primary key (ORD_ID);

CREATE INDEX DELIVERY_X01 ON DELIVERY(END_DATE, START_DATE);

CREATE INDEX DELIVERY_X02 ON DELIVERY(VEHICLE_ID);

select pg_relation_size('ords'), pg_relation_size('delivery');

pg_relation_size | pg_relation_size

------------------+------------------

157540352 | 148946944

The following is the SQL statement we are trying to improve the performance of.

SELECT A.*, B.*

FROM ORDS A LEFT JOIN DELIVERY B

ON (A.ORD_ID = B.ORD_ID

AND (B.START_DATE <= DATE '2021-07-12' AND B.END_DATE >= DATE '2021-07-10'

OR (B.VEHICLE_ID > '990')

)

WHERE A.ORD_DATE BETWEEN DATE '2021-06-01' AND DATE '2021-07-10';

Here is the execution plan of the statement.

Gather (actual time=86.645..101.685 rows=54501 loops=1)

Workers Planned: 2

Workers Launched: 2

Buffers: shared hit=13615 read=23995, temp read=1196 written=1272

-> Parallel Hash Left Join (actual time=83.360..87.135 rows=18167 loops=3)

Hash Cond: (a.ord_id = b.ord_id)

Buffers: shared hit=13614 read=23995, temp read=1196 written=1272

-> Parallel Seq Scan on ords a (actual time=0.047..34.335 rows=18167 loops=3)

Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))

Rows Removed by Filter: 315166

Buffers: shared hit=4968 read=14263

-> Parallel Hash (actual time=42.999..42.999 rows=5333 loops=3)

Buckets: 32768 Batches: 8 Memory Usage: 608kB

Buffers: shared hit=8450 read=9732, temp written=280

-> Parallel Seq Scan on delivery b (actual time=0.069..40.615 rows=5333 loops=3)

Filter: (((start_date <= '2021-07-12'::date) AND (end_date >= '2021-07-10'::date)) OR ((vehicle_id)::text > '990'::text))

Rows Removed by Filter: 328000

Buffers: shared hit=8450 read=9732

Planning:

Buffers: shared hit=20

Planning Time: 0.357 ms

Execution Time: 103.282 ms

The optimizer has let me down here. It didn't access any indexes in the DELIVERY table. The plan shows us that we have to scan all the data from the DELIVERY table before applying the filter condition and discarding about (1000000 - (5333 x 3)) rows. (We can check the Rows Removed by Filter item to see that one of the worker processes discarded 328000 rows.) Depending on the pattern of data in the table, this strategy can be efficient. However, we shouldn't put full trust in the planner's decision. I had expected that two Bitmap Index Scans accessing the delivery_x01 and delivery_x02 would appear followed by the BitmapOr operation when fetching rows from the DELIVERY table. Unlike what I anticipated, the planner chose to do a table scan with the parallelism.

To compare the execution plan I expected with the plan PostgreSQL chose, I set the parameter max_parallel_workers_per_gather to 0 and re-ran the SQL statement.

set max_parallel_workers_per_gather = 0;

--I re-ran the query under investigation and here is the resulting execution plan.

Hash Right Join (actual time=100.080..119.375 rows=54501 loops=1)

Hash Cond: (b.ord_id = a.ord_id)

Buffers: shared hit=3304 read=18847, temp read=903 written=903

-> Bitmap Heap Scan on delivery b (actual time=1.374..4.277 rows=16000 loops=1)

Recheck Cond: (((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'::date)) OR ((ve

hicle_id)::text > '990'::text))

Heap Blocks: exact=2182

Buffers: shared hit=2919

-> BitmapOr (actual time=1.108..1.109 rows=0 loops=1)

Buffers: shared hit=737

-> Bitmap Index Scan on delivery_x01 (actual time=0.809..0.810 rows=7000 loops=1)

Index Cond: ((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'::date))

Buffers: shared hit=726

-> Bitmap Index Scan on delivery_x02 (actual time=0.298..0.298 rows=9000 loops=1)

Index Cond: ((vehicle_id)::text > '990'::text)

Buffers: shared hit=11

-> Hash (actual time=98.373..98.374 rows=54501 loops=1)

Buckets: 32768 Batches: 4 Memory Usage: 2331kB

Buffers: shared hit=384 read=18847, temp written=697

-> Seq Scan on ords a (actual time=0.122..85.072 rows=54501 loops=1)

Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))

Rows Removed by Filter: 945499

Buffers: shared hit=384 read=18847

Planning:

Buffers: shared hit=12

Planning Time: 0.232 ms

Execution Time: 120.843 ms

By using Bitmap Index Scan and BitmapOr operations we could drop the number of block I/Os, but the execution time increased from 103 ms to 120 ms. The elapsed time gap is due to the parallelization. We can infer that if parallelization works in the BitmapOr operation, we would get the query faster. But unfortunately in the PostgreSQL version I am on, which is 13, Bitmap Index Scan cannot use the parallelism.

Please refer to the following post regarding the bitmap index scan and the parallelism.

https://www.postgresql-archive.org/Parallel-bitmap-index-scan-td6147662.html

As far as I know, there is no workaround to force the planner to use the BitmapOr operation without setting max_parallel_workers_per_gather to 0.

When there is OR in the WHERE clause, the planner tends to prefer a table scan to an index scan.

Now let's apply the tuning principle

"When there are complex OR predicates, rewrite the SQL as UNION ALL queries."

to the SQL statement and re-engineer the code as follows:

SELECT A.*, B.*

FROM ORDS A

LEFT JOIN (SELECT *

FROM DELIVERY

WHERE VEHICLE_ID > '990'

UNION ALL

SELECT *

FROM DELIVERY

WHERE START_DATE <= DATE '2021-07-12' AND END_DATE >= DATE '2021-07-10'

AND COALESCE(NOT(VEHICLE_ID > '990'),TRUE)) B

ON A.ORD_ID = B.ORD_ID

WHERE A.ORD_DATE BETWEEN DATE '2021-06-01' AND DATE '2021-07-10';

I used COALESCE(NOT(VEHICLE_ID > '990'),TRUE) to prevent getting rows twice.

Here is the resulting execution plan.

Hash Right Join (actual time=43.932..64.737 rows=54501 loops=1)

Hash Cond: (delivery.ord_id = a.ord_id)

Buffers: shared hit=7984 read=14167, temp read=903 written=903

-> Append (actual time=0.526..5.993 rows=16000 loops=1)

Buffers: shared hit=2919

-> Bitmap Heap Scan on delivery (actual time=0.526..2.550 rows=9000 loops=1)

Recheck Cond: ((vehicle_id)::text > '990'::text)

Heap Blocks: exact=1091

Buffers: shared hit=1102

-> Bitmap Index Scan on delivery_x02 (actual time=0.437..0.438 rows=9000 loops=1)

Index Cond: ((vehicle_id)::text > '990'::text)

Buffers: shared hit=11

-> Bitmap Heap Scan on delivery delivery_1 (actual time=0.965..2.596 rows=7000 loops=1

Recheck Cond: ((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-

Filter: COALESCE(((vehicle_id)::text <= '990'::text), true)

Heap Blocks: exact=1091

Buffers: shared hit=1817

-> Bitmap Index Scan on delivery_x01 (actual time=0.874..0.875 rows=7000 loops=1)

Index Cond: ((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'

Buffers: shared hit=726

-> Hash (actual time=43.131..43.182 rows=54501 loops=1)

Buckets: 32768 Batches: 4 Memory Usage: 2331kB

Buffers: shared hit=5064 read=14167, temp written=697

-> Gather (actual time=0.280..31.877 rows=54501 loops=1)

Workers Planned: 2

Workers Launched: 2

Buffers: shared hit=5064 read=14167

-> Parallel Seq Scan on ords a (actual time=0.042..31.570 rows=18167 loops=3)

Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))

Rows Removed by Filter: 315166

Buffers: shared hit=5064 read=14167

Planning:

Buffers: shared hit=4

Planning Time: 0.215 ms

Execution Time: 66.304 ms

Notice the appearance of Bitmap Index Scan in the above execution plan. By re-engineering the query with UNION ALL, we could get more efficient plan (Block I/Os and execution time dropped). Even though the parallelism doesn't kick in, we get the query faster.

Will this tuning technique always work? No, not necessarily always. If there are not optimal indexes, rewriting the query as UNION ALL queries degrades performance.

Addendum

We might consider using a lateral join in the revised query. PostgreSQL iterates over each of the row of ORDS and runs UNION ALL subquery giving an access to that row. Since there are 54501 rows in the results of ORDS, it doesn't seem to be a good strategy. For your reference, I used a lateral join and presented the execution plan here:

SELECT A.*, B.*

FROM ORDS A

LEFT JOIN LATERAL (SELECT *

FROM DELIVERY B

WHERE VEHICLE_ID > '990'

AND A.ORD_ID = B.ORD_ID

UNION ALL

SELECT *

FROM DELIVERY B

WHERE START_DATE <= DATE '2021-07-12' AND END_DATE >= DATE '2021-07-10'

AND COALESCE(NOT(VEHICLE_ID > '990'),TRUE)

AND A.ORD_ID = B.ORD_ID) B ON TRUE