설문조사
PostgreSQL/PPAS 관련 듣고 싶은 교육은


총 게시물 169건, 최근 0 건
   

PGLZ vs. LZ4

글쓴이 : 모델광 날짜 : 2022-05-21 (토) 20:55 조회 : 2660

In PostgreSQL 14, two compression algorithms are supported: pglz and lz4. And the default value of the parameter default_toast_compression is still pglz. I can't explain why the default value is still pglz, which is not performant.
In this note I will show you why we should set the parameter default_toast_compression to LZ4. 


I will start with a little scripts to demonstrate a point about column compressioin.
I have created two tables. One has a column with the compression algorithm of PGLZ. The other has a column with the compression algorithm of LZ4.

DROP TABLE T2;
CREATE TABLE T2(
ID INT,
C1 TEXT COMPRESSION pglz)
;
DROP TABLE T3;
CREATE TABLE T3(
ID INT,
C1 TEXT COMPRESSION lz4)
;
analdb=# \d+ t2
                                            Table "public.t2"
 Column |  Type   | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
 id     | integer |           |          |         | plain    |             |              |
 c1     | text    |           |          |         | extended | pglz        |              |
Access method: heap

analdb=# \d+ t3
                                            Table "public.t3"
 Column |  Type   | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
 id     | integer |           |          |         | plain    |             |              |
 c1     | text    |           |          |         | extended | lz4         |              |
Access method: heap

I inserted 100,000 rows into those two tables to check the speed of the insert operation.

analdb=# \timing
Timing is on.
analdb=# INSERT INTO T2 SELECT i, repeat(i::varchar||'kkk',1000) FROM generate_series(1,100000) a(i);
INSERT 0 100000
Time: 4158.828 ms (00:04.159)
analdb=# INSERT INTO T3 SELECT i, repeat(i::varchar||'kkk',1000) FROM generate_series(1,100000) a(i);
INSERT 0 100000
Time: 767.550 ms

analdb=# select id, pg_column_compression(c1) from t2 limit 1;
 id | pg_column_compression
----+-----------------------
  1 | pglz
analdb=# select id, pg_column_compression(c1) from t3 limit 1;
 id | pg_column_compression
----+-----------------------
  1 | lz4

As you can see, the time spent on data insertion increases significantly when using PGLZ.  LZ4 compression only takes about 1/7 of the time spent by PGLZ.

I checked the size of those two tables.

analdb=# select relname, relpages, relpages*8/1024 size_MB, reltuples, reltoastrelid, pg_relation_filepath(reltoastrelid)
analdb-#   from pg_class where relname in ('t2', 't3');
 relname | relpages | size_mb | reltuples | reltoastrelid | pg_relation_filepath
---------+----------+---------+-----------+---------------+----------------------
 t3      |     1126 |       |    100000 |         16795 | base/16388/16801
 t2      |     1799 |      14 |    100000 |         16790 | base/16388/16804

We can see that the compression ratio of LZ4 is better.

I checked the speed of update operation.

analdb=# update t2
   set c1 =  repeat('abc',1000);
UPDATE 100000
Time: 1654.965 ms (00:01.655)

analdb=# update t3
   set c1 =  repeat('abc',1000);
UPDATE 100000
Time: 448.726 ms 


We can see that it takes longer to update a column with the compression algorithm PGLZ.


Conclusion
Compared with PGLZ, LZ4 is more efficient. I highly recommend you should set the parameter default_toast_compression to LZ4.


PostgresDBA 2022-05-21 (토) 22:52
엄지척. 7월에 뵈요~
댓글주소
   

postgresdba.com