Most relational databases have an
architecture of the parser, transformer, optimizer and executor and it
works well. Most of your queries will be OK. But there might
be a few queries which do not have optimal excution plans and it only
takes one to ruin the throughput of your application programs.
When the optimizer produces an execution plan, it tries to check almost all possible join orders.
In most cases the default plan chosen by PostgreSQL is optimal but in
some cases the plan is not optimal. Other DBMSs provide hints to allow
an SQL developer to influence the query execution plan but PostgreSQL
does not, which is a serious drawback. When you are certain that the
optimizer made a wrong decision on the join order and you know the best
possible join order, you have to tweak the query to produce the desired execution plan.
Often I hear application developers say, "SQL is a declarative language. We don't have to care about how to retrieve it. If we specify what to retrieve,
the DBMS will take care of it." Theoretically, we don't have to worry
about how to retrieve data, but optimizers don't know the data the way
we know it. That is why sometimes we have to dictate the planner to
retrieve the data in a more efficient way.
In this note I will offer some methods to force the join order in PostgreSQL.
I
am an ardent advocate for taking an evidence-based, scientific, and
methodical approach to query performance problem solving. But to be
honest, I often take the approach of "do whatever comes into my mind and
make guesses until the problem goes away...ㅠㅠ;".
I
will be using a database of PostgreSQL 14 in the example below. Before
we dig into a practical example, it is necessary to create some tables
that we can later use to perform our joins.
CREATE
TABLE employee (
empno numeric(5,0) NOT NULL,
ename character varying(10),
job character varying(9),
mgr numeric(5,0),
hiredate timestamp(0),
sal numeric(7,2),
comm numeric(7,2),
deptno numeric(2,0),
sido_nm character varying(100)
);
I added some rows to the table.
insert
into employee
select
i, chr(65+mod(i,26))||i::text||'NM'
,case when mod(i,10000)=0 then
'PRESIDENT'
when mod(i,1000) = 0 then 'MANAGER'
when mod(i,3)=0 then 'SALESMAN'
when mod(i,3)=1 then 'ANALYST'
when mod(i,3)=2 then 'CLERK'
end as job
,case when mod(i,10000)= 0 then null
when mod(i,1000)= 1 then 10000
when i >= 9000 then 1000
else ceiling((i+1000)/1000)*1000
end as mgr
, current_date - i
, trunc(random() * 10000) as sal
, trunc(random() * 10000) as com
, mod(i,12)+1 as deptno
, case when mod(i,3) = 0 then 'Jeonbuk'
when mod(i,3) = 1 then 'Kangwon'
else 'Chungnam'
end as sido_nm
from
generate_series(1,10000) a(i);
ALTER
TABLE employee ADD CONSTRAINT employee_pk
PRIMARY KEY (empno);
select
pg_relation_size('employee');
0.9Mbytes
drop
table offline_order;
create
table offline_order (
ord_no
numeric(10,0) not null,
cust_no
numeric not null,
ord_date
timestamp(0) not null,
ord_dt varchar(8)
not null,
ord_status_cd
varchar(1) not null,
empno numeric(5,0),
comment varchar(100)
);
insert
into offline_order
select
i, mod(i,1000000) as cust_no
,current_date - mod(i,1000) as ord_date
,to_char((current_date -
mod(i,1000)),'yyyymmdd') as ord_dt
,(mod(i,4) + 1) as ord_status_cd
,mod(i,10000) + 1 as empno
,lpad('y',100,'y')
from generate_series(2,2000000,2) a(i);
alter
table offline_order add constraint offline_order_pk
primary
key (ord_no);
CREATE
INDEX OFFLINE_ORDER_X01 ON OFFLINE_ORDER(CUST_NO);
CREATE
INDEX OFFLINE_ORDER_X02 ON OFFLINE_ORDER(EMPNO);
select
* from pg_relation_size('offline_order');
--174M
create
table ord_item (
ord_no
numeric(10,0) not null,
prod_id
varchar(10) not null,
unit_price
numeric(10) not null,
quantity numeric(10) not null,
on_off_code
varchar(2) not null,
oder_comment
varchar(100)
);
--prod1 ~ prod200
insert
into ord_item
select
a.ord_no, 'prod'||(mod(ord_no,200)+1)::text as prod_id
, trunc(100*random())+1 as unit_price
, trunc(10*random())+1 as quantity
, case when mod(ord_no,2)=0 then '01'
else '02' end as on_off_code
, lpad('c',100,'y')
from
(select ord_no from online_order
union all
select ord_no from offline_order
) a
;
Let us add some data to this table.
insert
into ord_item
select
a.ord_no, 'prod'||(mod(ord_no,200)+2)::text as prod_id
, trunc(100*random())+1 as unit_price
, trunc(10*random())+1 as quantity
, case when mod(ord_no,2)=0 then '01'
else '02' end as on_off_code
, lpad('d',100,'q')
from
(select ord_no from online_order
union all
select ord_no from offline_order
) a
;
alter
table ord_item add constraint ord_item_pk
primary
key(ord_no, prod_id);
select
* from ord_item where ord_no in (998,999);
select
pg_relation_size('ord_item');
--648M
Here is the query we are interested in followed by its execution plan.
* SQL statement 1
select a.ename, a.job, b.ord_date, b.ord_status_cd, c.prod_id, c.quantity
from employee a, offline_order b, ord_item c
where a.empno between 1 and 5
and a.empno = b.empno
and b.ord_no = c.ord_no;
Gather (cost=1008.88..27819.86 rows=940 width=36)
Workers Planned: 2
-> Nested Loop (cost=8.88..26725.86 rows=392 width=36)
-> Hash Join (cost=8.45..26546.33 rows=208 width=30)
Hash Cond: (b.empno = a.empno)
-> Parallel Seq Scan on offline_order b (cost=0.00..25443.67 rows=416667 width=21)
-> Hash (cost=8.39..8.39 rows=5 width=19)
-> Index Scan using employee_pk on employee a (cost=0.29..8.39 rows=5 width=19)
Index Cond: ((empno >= '1'::numeric) AND (empno <= '5'::numeric))
-> Index Scan using ord_item_pk on ord_item c (cost=0.43..0.84 rows=2 width=18)
Index Cond: (ord_no = b.ord_no)
We can see that the join order chosen by the planner is :
(employee -> offline_order) -> ord_item
It seems that the planner made a good decision on the join method and the join order. But let's suppose that the plan is not optimal and you know the right order to join tables efficiently and the better join order is:
employee -> ( offline_order -> ord_item).
In order to produce the better execution plan I used an explicit join.
*SQL statement 2
select a.ename, a.job, b.ord_date, b.ord_status_cd, c.prod_id, c.quantity
from employee a join (offline_order b join ord_item c on (b.ord_no = c.ord_no))
on (a.empno = b.empno)
where a.empno between 1 and 5;
Unlike
what I expected the query above produced the same execution plan.
PostgreSQL doesn't supply its thinking process, so I can't guarantee the
correctness of the description I've give here, but it is probabley quite
accurate. In the SQL statement 2, subquery collapse kicked in and the
optimizer transformed the SQL statement 2 to the SQL statement 1.
In order to prevent subquery collapse I rewrote the query like the following:
*SQL statement 3
select a.ename, a.job, d.ord_date, d.ord_status_cd, d.prod_id, d.quantity
from employee a
,(select b.empno, b.ord_date, b.ord_status_cd, c.prod_id, c.quantity
from offline_order b, ord_item c
where b.ord_no = c.ord_no
offset 0) d
where a.empno = d.empno
and a.empno between 1 and 5;
Hash Join (cost=49645.45..255746.63 rows=940 width=36)
Hash Cond: (b.empno = a.empno)
-> Hash Join (cost=49637.00..232006.28 rows=1879589 width=27)
Hash Cond: (c.ord_no = b.ord_no)
-> Seq Scan on ord_item c (cost=0.00..119133.41 rows=3999941 width=18)
-> Hash (cost=31277.00..31277.00 rows=1000000 width=21)
-> Seq Scan on offline_order b (cost=0.00..31277.00 rows=1000000 width=21)
-> Hash (cost=8.39..8.39 rows=5 width=19)
-> Index Scan using employee_pk on employee a (cost=0.29..8.39 rows=5 width=19)
Index Cond: ((empno >= '1'::numeric) AND (empno <= '5'::numeric))
The join order, as expected, now is :
employee -> ( offline_order -> ord_item)
You may even want to change the join method. Below are some configuration parameters with which you can influence the join method. But I will not go into details here.
enable_hashjoin
enable_mergejoin
enable_nestloop
With regard to reordering of explict JOINs, there is a parameter which controls the join order to some extent. It is join_collapse_limit.
Setting it to 1 prevents the reordering of explicit JOINs. The explicit
join order specified in the query will be the actual order in which the
tables are joined.
I lowered the parameter join_collapse_limit to 1 and ran the SQL statement 2 again and got this execution plan.
*SQL statement 2
set join_collapse_limit=1;
select a.ename, a.job, b.ord_date, b.ord_status_cd, c.prod_id, c.quantity
from employee a join (offline_order b join ord_item c on (b.ord_no = c.ord_no))
on (a.empno = b.empno)
where a.empno between 1 and 5;
Gather (cost=31660.45..133986.49 rows=940 width=36)
Workers Planned: 2
-> Hash Join (cost=30660.45..132892.49 rows=392 width=36)
Hash Cond: (b.empno = a.empno)
-> Parallel Hash Join (cost=30652.00..130827.37 rows=783162 width=27)
Hash Cond: (c.ord_no = b.ord_no)
-> Parallel Seq Scan on ord_item c (cost=0.00..95800.42 rows=1666642 width=18)
-> Parallel Hash (cost=25443.67..25443.67 rows=416667 width=21)
-> Parallel Seq Scan on offline_order b (cost=0.00..25443.67 rows=416667 width=21)
-> Hash (cost=8.39..8.39 rows=5 width=19)
-> Index Scan using employee_pk on employee a (cost=0.29..8.39 rows=5 width=19)
Index Cond: ((empno >= '1'::numeric) AND (empno <= '5'::numeric))
Now
the planner produced the join order we wanted to see. But it is
slightly different from the plan of the SQL statement 3. The planner
launched parallel worker processes. The downside is that you cannot
adopt this technique of setting the parameter join_collapse_limit if you
are using the connection pool architecture.
This is all about how to force the join order.
Wrap-up
When you want to influence the join order you have two options:
1. Write an SQL statement with an explicit join and set the value of the parameter join_collapse_limit to 1.
2. Use an inline view and prevent subquery collapse with OFFSET clause.
Added on Aug 26, 2023
If you want the optimizer to use a nested loop join, because you happen to know that it is the best join strategy, you can deliverately rewrite a regular join to a lateral cross join to force a nested loop. For example, this query
select a.ename, a.job, b.ord_date
from employee a, offline_order b
where a.empno between 1 and 5
and a.empno = b.empno;
is semantically equivalent to
select a.ename, a.job, foo.ord_date
from employee a
cross join lateral
(select ord_date
from offline_order b
where a.empno = b.empno
offset 0) as foo;