In the last article titled "row by row solution vs. set based solution 2" I showed an Oracle procedure which does not have an optimal algorithm.
CREATE OR REPLACE PROCEDURE P_UPDATE_COMMENT(p_ord_dt VARCHAR2 )
IS
v_comment ONLINE_ORDER%TYPE;
CURSOR C1 IS SELECT *
FROM ONLINE_ORDER
WHERE ORD_DT = p_ord_dt
FOR UPDATE;
BEGIN
FOR rec IN C1 LOOP
IF rec.ORD_STATUS_CD = '1' THEN v_comment = 'Received';
END IF;
IF rec.ORD_STATUS_CD = '2' THEN v_comment = 'Confirmed';
END IF;
IF rec.ORD_STATUS_CD = '3' THEN v_comment = 'Cancelled';
END IF;
IF rec.ORD_STATUS_CD = '4' THEN v_comment = 'Ordered';
END IF;
UPDATE ONLINE_ORDER
SET COMMENT = v_comment
WHERE CURRENT OF C1;
END LOOP;
COMMIT;
END;
Let us assume that the row by row algorithm in the procedure is optimal or there is another business logic in the procedure which stops us from using the set based technique.
Now we can see that the application developer cared about data integrity. The developer used SELECT FOR UPDATE statement when
declaring the cursor c1 to control the concurrent access to one or more
rows. The developer prohibitted other transactions from changing the
records in the cursor result set while allocating a value to the
variable V_COMMENT. But sometimes SELECT FOR UPDATE statement could lead to an unecessary locking behavior when there is a child table associated with the parent table.
In most cases we do not change the key value. In the online_order table, no one would try to change the value in the key column ord_no. So If I had to declare a cursor in PostgreSQL, I would write the code like this:
DECLARE c1 CURSOR FOR
SELECT * FROM online_order WHERE ord_dt = p_ord_dt FOR NO KEY UPDATE;
In this note I will explain why we should use the FOR NO KEY UPDATE mode when there is a child table and we do not intend to change the key value.
Let's create a table of employees, a parent table, and a table of offline_order, a child table.
--Due to the volume limitation of this bulletin board I moved the script to the comment section below.
Note that I have created a foreign key constraint.
alter table offline_order add constraint offline_orker_fk foreign key(empno) references employee(empno);
Now open a SESSION 1 and run the following quries.
--session 1
analdb=# begin work;
BEGIN
analdb=*# select * from employee where empno = 2 for update;
empno | ename | job | mgr | hiredate | sal | comm | deptno | sido_nm
-------+-------+-------+------+---------------------+---------+---------+--------+----------
2 | C2NM | CLERK | 1000 | 2022-01-06 00:00:00 | 9383.00 | 7793.00 | 3 | Chungnam
(1 row)
analdb=*# update employee set sal = sal+1 where empno = 2;
UPDATE 1
Now let's take a look at how locks are handled. To do this, we have to do search on the pg_locks view.
select locktype, relation::regclass, virtualxid, transactionid, mode, pid, granted
, pg_blocking_pids(pid)
from pg_locks;
locktype | relation | virtualxid | transactionid | mode | pid | granted | pg_blocking_pids
---------------+-------------+------------+---------------+------------------+------+---------+------------------
relation | employee_pk | | | RowShareLock | 6195 | t | {}
relation | employee_pk | | | RowExclusiveLock | 6195 | t | {}
relation | employee | | | RowShareLock | 6195 | t | {}
relation | employee | | | RowExclusiveLock | 6195 | t | {}
virtualxid | | 5/922 | | ExclusiveLock | 6195 | t | {}
relation | pg_locks | | | AccessShareLock | 6563 | t | {}
virtualxid | | 4/973 | | ExclusiveLock | 6563 | t | {}
transactionid | | | 2045 | ExclusiveLock | 6195 | t | {}
We
can see that session 6195 is holding RowExclusiveLock because we
updated a row and did not commit or rollback the transaction yet.
Now open another SESSION 2 and run the following queries.
--session 2
analdb=# begin work;
BEGIN
analdb=*# insert into offline_order values (11,1,now(),'99991212','1',2,'test');
...wait...
...wait...
We can observe that the insert operation in the child table is in a waiting state. Even though the FOR UPDATE clause doesn't exist in the SESSION 2, the transaction would still remain in a waiting state.
Now let's take a look at what locks are being used.
select locktype, relation::regclass, virtualxid, transactionid, mode, pid, granted
, pg_blocking_pids(pid)
from pg_locks;
locktype | relation | virtualxid | transactionid | mode | pid | granted | pg_blocking_pids
---------------+-------------------+------------+---------------+------------------+------+---------+------------------
relation | employee_pk | | | RowShareLock | 6195 | t | {}
relation | employee_pk | | | RowExclusiveLock | 6195 | t | {}
relation | employee | | | RowShareLock | 6195 | t | {}
relation | employee | | | RowExclusiveLock | 6195 | t | {}
virtualxid | | 5/923 | | ExclusiveLock | 6195 | t | {}
relation | employee_pk | | | RowShareLock | 6372 | t | {6195}
relation | employee | | | RowShareLock | 6372 | t | {6195}
relation | offline_order_x02 | | | RowExclusiveLock | 6372 | t | {6195}
relation | offline_order_x01 | | | RowExclusiveLock | 6372 | t | {6195}
relation | offline_order_pk | | | RowExclusiveLock | 6372 | t | {6195}
relation | offline_order | | | RowExclusiveLock | 6372 | t | {6195}
virtualxid | | 6/1615 | | ExclusiveLock | 6372 | t | {6195}
relation | pg_locks | | | AccessShareLock | 6563 | t | {}
virtualxid | | 4/975 | | ExclusiveLock | 6563 | t | {}
transactionid | | | 2048 | ExclusiveLock | 6195 | t | {}
tuple | employee | | | AccessShareLock | 6372 | t | {6195}
transactionid | | | 2048 | ShareLock | 6372 | f | {6195}
transactionid | | | 2049 | ExclusiveLock | 6372 | t | {6195}
We can observe that the SESSION 2 with the process id of 6372 is trying to get a ShareLock but it didn't get the lock yet (granted = f).
As long as the transaction in the SESSION 1 does not commit or rollback, the transaction (transactionid=2048) in the SESSION 2 cannot complete its activity.
Rollback the transactions in the Session 1 and the Session 2.
Now, let try the same example with the NO KEY option.
--session 1
analdb=# begin work;
BEGIN
analdb=*# select * from employee where empno = 2 for no key update;
empno | ename | job | mgr | hiredate | sal | comm | deptno | sido_nm
-------+-------+-------+------+---------------------+---------+---------+--------+----------
2 | C2NM | CLERK | 1000 | 2022-01-06 00:00:00 | 9383.00 | 7793.00 | 3 | Chungnam
analdb=*# update employee set sal = sal+1 where empno = 2;
UPDATE 1
--session 2
analdb=# begin work;
BEGIN
analdb=*# insert into offline_order values (11,1,now(),'99991212','1',2,'test');
INSERT 0 1 --we can observe that the insert operation is done without any waits.
From the above results, the INSERT operation in the SESSION 2 was much smooth and has completed its activity without any waits.
Conclusion
SELECT ~ FOR UPDATE statement assumes a total change of the rows of the result set. SELECT ~ FOR NO KEY UPDATE statement assumes a change only to the columns that are not involved in unique indexes, in other words this change does not affect foreign keys.
we should use the FOR NO KEY UPDATE mode when there is a child table and we do not intend to change the key value.
Addendum
If
the child table offline_order doesn't have a foreign key constraint,
you don't have to care about using SELECT ~ FOR NO KEY UPDATE statement. So some
developers who went through performance degradation due to constraints
object to setting foreing key constraints. Performance, however, is not
the only god to worshhip. It is clear that the younger generation of
developers for some reason consider their code fairly reliable, but let
me quote Mike Tyson: "Everybody has a plan until they get punched in the
mouth." Do not sacrifice data integrity for performance.