In the bottom section of the post ("Improve your schema design"), the author argues that proper schema design can dramatically reduce the amount of data that has to be written to the table and to the WAL. According to the author, with respect to the amount of data written to disk, the following design is bad:
In the above model, every UPDATE to the "last login" timestamp copies every column value in the row, and all of those values are written into the WAL as well, even though none of them is changing.

Instead, you should design the table like this:

In the revised model, the amount of data written to the table and to the WAL is minimized.
As database practitioners, we should be skeptical of such claims, so I decided to verify the assertion made in the blog post. I performed the experiment on PostgreSQL 16.0.
```sql
CREATE TABLE tb_users (
    user_no       int NOT NULL,
    user_nm       character varying(10) NOT NULL,
    job           character varying(9),
    regist_dt     date NOT NULL,
    last_login_dt timestamp,
    dummy         text
);

ALTER TABLE tb_users ADD CONSTRAINT tb_users_pk PRIMARY KEY (user_no);
```
```sql
CREATE TABLE tb_users_h (
    user_no       int NOT NULL,
    last_login_dt timestamp
);

ALTER TABLE tb_users_h ADD CONSTRAINT tb_users_h_pk PRIMARY KEY (user_no, last_login_dt);
```
```sql
INSERT INTO tb_users
SELECT i,
       chr(65 + mod(i, 26)) || i::text || 'NM',
       CASE WHEN mod(i, 10000) = 0 THEN 'PRESIDENT'
            WHEN mod(i, 1000)  = 0 THEN 'MANAGER'
            WHEN mod(i, 3) = 0 THEN 'SALESMAN'
            WHEN mod(i, 3) = 1 THEN 'ANALYST'
            WHEN mod(i, 3) = 2 THEN 'CLERK'
       END AS job,
       current_date - random_normal(5, 10)::int,
       NULL,
       CASE WHEN mod(i, 3) = 0 THEN 'Jeonbuk'
            WHEN mod(i, 3) = 1 THEN 'Kangwon'
            ELSE 'Chungnam'
       END || repeat('_k', 100)
  FROM generate_series(1, 100000) a(i);
```
```sql
ALTER TABLE scott.tb_users   SET (autovacuum_enabled = false);
ALTER TABLE scott.tb_users_h SET (autovacuum_enabled = false);
```
All I did was create the TB_USERS and TB_USERS_H tables and insert 100,000 rows into TB_USERS. I disabled autovacuum on both tables so that the WAL size measured would reflect only the insert and update operations.
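To confirm the setting took effect, the stored options can be checked in pg_class (a quick sanity check, not shown in the original post):

```sql
-- reloptions should show {autovacuum_enabled=false} for both tables
SELECT relname, reloptions
  FROM pg_class
 WHERE relname IN ('tb_users', 'tb_users_h');
```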
Next, I created a procedure that updates 1,000 rows in the TB_USERS table.
```sql
CREATE OR REPLACE PROCEDURE updateUser() AS $$
BEGIN
    FOR id IN 1..1000 LOOP
        UPDATE scott.tb_users
           SET last_login_dt = current_timestamp
         WHERE user_no = id;
        COMMIT;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```
I also created a procedure that inserts 1,000 rows into the TB_USERS_H table.
```sql
CREATE OR REPLACE PROCEDURE insertUser() AS $$
BEGIN
    FOR id IN 1..1000 LOOP
        INSERT INTO scott.tb_users_h
        SELECT user_no, current_timestamp
          FROM tb_users s
         WHERE s.user_no = id;
        COMMIT;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```
The purpose of these two procedures is to measure the WAL size generated by each approach. Now we are ready to take the measurements. I issued the following command in psql to capture the baseline statistics:
```sql
SELECT *, pg_current_wal_lsn() FROM pg_stat_database WHERE datname = current_database() \gset
```
Most of the statistics come from pg_stat_database. The WAL size is computed from the current WAL write position, exposed by pg_current_wal_lsn(), with pg_wal_lsn_diff() returning the number of bytes between two positions. The \gset meta-command stores each result column in a psql variable, which I used afterward to compute the differences.
Then I ran the following command:
```sql
CALL updateUser();
```
As soon as the procedure completed, I ran a query to check the deltas against the saved statistics.
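The post does not reproduce the exact comparison query; a minimal sketch, assuming the \gset above stored each column of pg_stat_database (and the LSN) in a psql variable of the same name, looks like this:

```sql
-- :tup_updated, :tup_inserted and :'pg_current_wal_lsn' are the psql
-- variables captured by \gset before the workload ran.
SELECT tup_updated  - :tup_updated  AS upd,
       tup_inserted - :tup_inserted AS ins,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), :'pg_current_wal_lsn')
       ) AS wal_size
  FROM pg_stat_database
 WHERE datname = current_database();
```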
We can observe that executing updateUser() resulted in 1,000 updates (upd = 1000) and 512 kB of WAL.
Now let's check the statistics generated by calling the insertUser procedure. I ran the following command in psql to capture a fresh baseline in psql variables:
```sql
SELECT *, pg_current_wal_lsn() FROM pg_stat_database WHERE datname = current_database() \gset
```
Then I called the procedure.
```sql
CALL insertUser();
```
After that, I ran the same statistics query as before to check the deltas.
Note that the WAL size dropped from 512 kB to 208 kB, a reduction of roughly 60%. The gap is explained by the difference in row length between the two tables: a TB_USERS row carries every column, including the wide dummy text, while a TB_USERS_H row holds only a user_no and a timestamp.
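To support the row-length explanation, one can compare the average row size of the two tables. This is a sketch (not part of the original post); pg_column_size() applied to a whole-row expression returns the size of the row's tuple representation, and tb_users_h must already contain rows for its average to be meaningful:

```sql
-- Average row size in bytes for each table
SELECT 'tb_users'   AS tab, avg(pg_column_size(t.*))::int AS avg_row_bytes FROM tb_users   t
UNION ALL
SELECT 'tb_users_h',        avg(pg_column_size(h.*))::int                  FROM tb_users_h h;
```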
Conclusion

Updates in PostgreSQL incur more block I/O than inserts, and the amount of disk I/O is a major cost driver in a cloud system. If you want to reduce disk I/O, avoid table designs that force many update operations. It can be much better to have an additional table that minimizes the amount of data written as part of any transaction.
Footnote

In the example above, there was only one index on the tb_users table. If there had been more indexes, the WAL generated by the update would have been even larger: any update that cannot use HOT must also insert new entries into every index, and those index writes are WAL-logged too.
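As a sketch of how one could verify the footnote (the index name is hypothetical and this step was not part of the original experiment), add a secondary index and re-run the measurement:

```sql
-- Hypothetical secondary index; updating last_login_dt now also writes
-- new index entries, since the update can no longer be HOT.
CREATE INDEX tb_users_last_login_ix ON scott.tb_users (last_login_dt);

-- Capture a fresh baseline, re-run the workload, and compare as before.
SELECT *, pg_current_wal_lsn() FROM pg_stat_database WHERE datname = current_database() \gset
CALL updateUser();
```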