PostgreSQL 은 특별합니다.
단순한 데이터베이스가 아니라, 하나의 어플리케이션 플랫폼이기 때문입니다.

공지사항

뉴스 및 기사

	총 게시물 176건, 최근 0 건

for update vs. for no key update

글쓴이 : 모델광 날짜 : 2022-03-19 (토) 06:58 조회 : 7870

In the last article titled "row by row solution vs. set based solution 2" I showed an Oracle procedure which does not have an optimal algorithm.

CREATE OR REPLACE PROCEDURE P_UPDATE_COMMENT(p_ord_dt   VARCHAR2 )
IS
v_comment      ONLINE_ORDER%TYPE;
CURSOR C1 IS SELECT *
                             FROM ONLINE_ORDER
                           WHERE ORD_DT = p_ord_dt
                         FOR UPDATE;
BEGIN
FOR rec IN C1 LOOP
    IF rec.ORD_STATUS_CD = '1' THEN v_comment = 'Received';
    END IF;
    IF rec.ORD_STATUS_CD = '2' THEN v_comment = 'Confirmed';
    END IF;
    IF rec.ORD_STATUS_CD = '3' THEN v_comment = 'Cancelled';
    END IF;
    IF rec.ORD_STATUS_CD = '4' THEN v_comment = 'Ordered';
    END IF;

    UPDATE ONLINE_ORDER
         SET COMMENT = v_comment
    WHERE CURRENT OF C1;
END LOOP;
COMMIT;
END;

Let us assume that the row by row algorithm in the procedure is optimal or there is another business logic in the procedure which stops us from using the set based technique.
Now we can see that the application developer cared about data integrity. The developer used SELECT FOR UPDATE statement when declaring the cursor c1 to control the concurrent access to one or more rows. The developer prohibitted other transactions from changing the records in the cursor result set while allocating a value to the variable V_COMMENT. But sometimes SELECT FOR UPDATE statement could lead to an unecessary locking behavior when there is a child table associated with the parent table.
In most cases we do not change the key value. In the online_order table, no one would try to change the value in the key column ord_no. So If I had to declare a cursor in PostgreSQL, I would write the code like this:

DECLARE c1 CURSOR FOR

SELECT * FROM online_order WHERE ord_dt = p_ord_dt FOR NO KEY UPDATE;

In this note I will explain why we should use the FOR NO KEY UPDATE mode when there is a child table and we do not intend to change the key value.

Let's create a table of employees, a parent table, and a table of offline_order, a child table.

--Due to the volume limitation of this bulletin board I moved the script to the comment section below.

Note that I have created a foreign key constraint.

alter table offline_order add constraint offline_orker_fk foreign key(empno) references employee(empno);

Now open a SESSION 1 and run the following quries.

--session 1
analdb=# begin work;
BEGIN
analdb=*# select * from employee where empno = 2 for update;
empno | ename | job | mgr | hiredate | sal | comm | deptno | sido_nm
-------+-------+-------+------+---------------------+---------+---------+--------+----------
2 | C2NM | CLERK | 1000 | 2022-01-06 00:00:00 | 9383.00 | 7793.00 | 3 | Chungnam
(1 row)

analdb=*# update employee set sal = sal+1 where empno = 2;
UPDATE 1

Now let's take a look at how locks are handled. To do this, we have to do search on the pg_locks view.

select locktype, relation::regclass, virtualxid, transactionid, mode, pid, granted

, pg_blocking_pids(pid)
from pg_locks;

   locktype    | relation   | virtualxid | transactionid |   mode       | pid | granted | pg_blocking_pids
---------------+-------------+------------+---------------+------------------+------+---------+------------------
relation      | employee_pk |            |               | RowShareLock     | 6195 | t       | {}
relation      | employee_pk |            |               | RowExclusiveLock | 6195 | t       | {}
relation      | employee    |            |               | RowShareLock     | 6195 | t       | {}
relation      | employee    |            |               | RowExclusiveLock | 6195 | t       | {}
virtualxid    |             | 5/922      |               | ExclusiveLock    | 6195 | t       | {}
relation      | pg_locks    |            |               | AccessShareLock | 6563 | t       | {}
virtualxid    |             | 4/973      |               | ExclusiveLock    | 6563 | t       | {}
transactionid |             |            |          2045 | ExclusiveLock    | 6195 | t       | {}

We can see that session 6195 is holding RowExclusiveLock because we updated a row and did not commit or rollback the transaction yet.

Now open another SESSION 2 and run the following queries.
--session 2
analdb=# begin work;
BEGIN
analdb=*# insert into offline_order values (11,1,now(),'99991212','1',2,'test');
...wait...
...wait...

We can observe that the insert operation in the child table is in a waiting state. Even though the FOR UPDATE clause doesn't exist in the SESSION 2, the transaction would still remain in a waiting state.
Now let's take a look at what locks are being used.

select locktype, relation::regclass, virtualxid, transactionid, mode, pid, granted

       , pg_blocking_pids(pid)
from pg_locks;
   locktype    |     relation      | virtualxid | transactionid | mode   | pid | granted | pg_blocking_pids
---------------+-------------------+------------+---------------+------------------+------+---------+------------------
relation      | employee_pk       |            |               | RowShareLock     | 6195 | t       | {}
relation      | employee_pk       |            |               | RowExclusiveLock | 6195 | t       | {}
relation      | employee          |            |               | RowShareLock     | 6195 | t       | {}
relation      | employee          |            |               | RowExclusiveLock | 6195 | t       | {}
virtualxid    |                   | 5/923      |               | ExclusiveLock    | 6195 | t       | {}
relation      | employee_pk       |            |               | RowShareLock     | 6372 | t       | {6195}
relation      | employee          |            |               | RowShareLock     | 6372 | t       | {6195}
relation      | offline_order_x02 |            |               | RowExclusiveLock | 6372 | t       | {6195}
relation      | offline_order_x01 |            |               | RowExclusiveLock | 6372 | t       | {6195}
relation      | offline_order_pk |            |               | RowExclusiveLock | 6372 | t       | {6195}
relation      | offline_order     |            |               | RowExclusiveLock | 6372 | t       | {6195}
virtualxid    |                   | 6/1615     |               | ExclusiveLock    | 6372 | t       | {6195}
relation      | pg_locks          |            |               | AccessShareLock | 6563 | t       | {}
virtualxid    |                   | 4/975      |               | ExclusiveLock    | 6563 | t       | {}
transactionid |                   |            |          2048 | ExclusiveLock    | 6195 | t       | {}
tuple         | employee          |            |               | AccessShareLock | 6372 | t       | {6195}
transactionid |                   |            |          2048 | ShareLock        | 6372 | f       | {6195}
transactionid |                   |            |          2049 | ExclusiveLock    | 6372 | t       | {6195}

We can observe that the SESSION 2 with the process id of 6372 is trying to get a ShareLock but it didn't get the lock yet (granted = f).
As long as the transaction in the SESSION 1 does not commit or rollback, the transaction (transactionid=2048) in the SESSION 2 cannot complete its activity.

Rollback the transactions in the Session 1 and the Session 2.
Now, let try the same example with the NO KEY option.

--session 1
analdb=# begin work;
BEGIN
analdb=*# select * from employee where empno = 2 for no key update;
empno | ename | job | mgr |      hiredate       |   sal   | comm   | deptno | sido_nm
-------+-------+-------+------+---------------------+---------+---------+--------+----------
     2 | C2NM | CLERK | 1000 | 2022-01-06 00:00:00 | 9383.00 | 7793.00 |      3 | Chungnam
analdb=*# update employee set sal = sal+1 where empno = 2;
UPDATE 1

--session 2
analdb=# begin work;
BEGIN
analdb=*# insert into offline_order values (11,1,now(),'99991212','1',2,'test');
INSERT 0 1 --we can observe that the insert operation is done without any waits.

From the above results, the INSERT operation in the SESSION 2 was much smooth and has completed its activity without any waits.

Conclusion
SELECT ~ FOR UPDATE statement assumes a total change of the rows of the result set. SELECT ~ FOR NO KEY UPDATE statement assumes a change only to the columns that are not involved in unique indexes, in other words this change does not affect foreign keys.
we should use the FOR NO KEY UPDATE mode when there is a child table and we do not intend to change the key value.

Addendum

If the child table offline_order doesn't have a foreign key constraint, you don't have to care about using SELECT ~ FOR NO KEY UPDATE statement. So some developers who went through performance degradation due to constraints object to setting foreing key constraints. Performance, however, is not the only god to worshhip. It is clear that the younger generation of developers for some reason consider their code fairly reliable, but let me quote Mike Tyson: "Everybody has a plan until they get punched in the mouth." Do not sacrifice data integrity for performance.

모델광

2022-03-19 (토) 07:02

CREATE TABLE employee (
empno numeric(5,0) NOT NULL,
ename character varying(10),
job character varying(9),
mgr numeric(5,0),
hiredate timestamp(0),
sal numeric(7,2),
comm numeric(7,2),
deptno numeric(2,0),
sido_nm character varying(100)
);
insert into employee
select i, chr(65+mod(i,26))||i::text||'NM'
,case when mod(i,10000)=0 then 'PRESIDENT'
when mod(i,1000) = 0 then 'MANAGER'
when mod(i,3)=0 then 'SALESMAN'
when mod(i,3)=1 then 'ANALYST'
when mod(i,3)=2 then 'CLERK'
end as job
,case when mod(i,10000)= 0 then null
when mod(i,1000)= 1 then 10000
when i >= 9000 then 1000
else ceiling((i+1000)/1000)*1000
end as mgr
, current_date - i
, trunc(random() * 10000) as sal
, trunc(random() * 10000) as com
, mod(i,12)+1 as deptno
, case when mod(i,3) = 0 then 'Jeonbuk'
when mod(i,3) = 1 then 'Kangwon'
else 'Chungnam'
end as sido_nm
from generate_series(1,10000) a(i);

ALTER TABLE employee ADD CONSTRAINT employee_pk
PRIMARY KEY (empno);

select pg_relation_size('employee');
0.9Mbytes

drop table offline_order;
create table offline_order (
ord_no numeric(10,0) not null,
cust_no numeric not null,
ord_date timestamp(0) not null,
ord_dt varchar(8) not null,
ord_status_cd varchar(1) not null,
empno numeric(5,0),
comment varchar(100)
);
insert into offline_order
select i, mod(i,1000000) as cust_no
,current_date - mod(i,1000) as ord_date
,to_char((current_date - mod(i,1000)),'yyyymmdd') as ord_dt
,(mod(i,4) + 1) as ord_status_cd
,mod(i,10000) + 1 as empno
,lpad('y',100,'y')
from generate_series(2,2000000,2) a(i);
alter table offline_order add constraint offline_order_pk
primary key (ord_no);
CREATE INDEX OFFLINE_ORDER_X01 ON OFFLINE_ORDER(CUST_NO);
CREATE INDEX OFFLINE_ORDER_X02 ON OFFLINE_ORDER(EMPNO);
select * from pg_relation_size('offline_order');
--174M

댓글주소

CREATE TABLE employee (
    empno numeric(5,0) NOT NULL,
    ename character varying(10),
    job character varying(9),
    mgr numeric(5,0),
    hiredate timestamp(0),
    sal numeric(7,2),
    comm numeric(7,2),
    deptno numeric(2,0),
    sido_nm character varying(100)
);
insert into employee
select i, chr(65+mod(i,26))||i::text||'NM'
      ,case when mod(i,10000)=0 then 'PRESIDENT'
            when mod(i,1000) = 0 then 'MANAGER'
            when mod(i,3)=0 then 'SALESMAN'
            when mod(i,3)=1 then 'ANALYST'
            when mod(i,3)=2 then 'CLERK'
        end as job
      ,case when mod(i,10000)= 0 then null
            when mod(i,1000)= 1 then 10000
            when i >= 9000 then 1000
            else ceiling((i+1000)/1000)*1000
        end as mgr
      , current_date - i
      , trunc(random() * 10000) as sal
      , trunc(random() * 10000) as com
      , mod(i,12)+1             as deptno
      , case when mod(i,3) = 0 then 'Jeonbuk'
             when mod(i,3) = 1 then 'Kangwon'
             else                   'Chungnam'
        end as sido_nm
from generate_series(1,10000) a(i);
 
                          
ALTER TABLE employee ADD CONSTRAINT employee_pk
  PRIMARY KEY (empno);
 
select pg_relation_size('employee');
0.9Mbytes

drop table offline_order;
create table offline_order (
ord_no numeric(10,0) not null,
cust_no numeric      not null,
ord_date timestamp(0) not null,
ord_dt   varchar(8)   not null,
ord_status_cd varchar(1) not null,
empno    numeric(5,0),
comment  varchar(100)
);
insert into offline_order
select i, mod(i,1000000) as cust_no
      ,current_date - mod(i,1000) as ord_date
      ,to_char((current_date - mod(i,1000)),'yyyymmdd') as ord_dt
      ,(mod(i,4) + 1) as ord_status_cd
      ,mod(i,10000) + 1 as empno
      ,lpad('y',100,'y')
  from generate_series(2,2000000,2) a(i);
alter table offline_order add constraint offline_order_pk
primary key (ord_no);
CREATE INDEX OFFLINE_ORDER_X01 ON OFFLINE_ORDER(CUST_NO);
CREATE INDEX OFFLINE_ORDER_X02 ON OFFLINE_ORDER(EMPNO);
select * from pg_relation_size('offline_order');
--174M

E-MAIL: root@postgresdba.com