A recent video by Connor McDonald prompted me to write this note.
https://www.youtube.com/watch?v=n9gP_H746kk
In
the video he demonstrated how a read-only table can have a much higher
full percentage even though the storage parameter PCTFREE is set to 10.
He explained that it is part of the implemenation of the Oracle database
that the first insert into a block Oracle almost ignores PCTFREE
because it would make no sense to immediately move that row to somewhere
else and leave that block empty.
I see Oracle default value for PCTFREE is 10 for tables and indexes, whereas PostgreSQL chooses the FILLFACTOR value as 100 and 90 respectively for tables and indexes.
Connor
McDonald says he has always been a big fan of the default for PCTFREE
should be 1 not 10. My rule of thumb for PCTFREE is 5 and 0 respectively
for tables and indexes.
Regarding
the FILLFACTOR value in PostgreSQL, I have always set the fillfactor
value to 95 for both tables and indexes in an OLTP system. The reason
was that I have read many blog posts which recommend that we should decrease the fillfactor value "a bit" to improve performance. The following are some of those articles:
https://www.cybertec-postgresql.com/en/what-is-fillfactor-and-how-does-it-affect-postgresql-performance/
https://www.slideshare.net/SiyeonAcademy/postgresql-20178 (refer to page 41.)
There
seems to be no rigorous rule with regard to setting the parameter
value. So I have decided to run up some test scripts to check whether
reducing the fillfactor value of tables really gives us some performance
improvement.
I have created two tables, one with a reduced FF value of 90 and the other with the default FF value of 100.
These scripts were run on PostgreSQL 14.
drop table t1;
create table t1(c1 bigint) with (fillfactor = 90);
create index t1_x01 on t1 (c1) with (fillfactor = 90);
drop table t2;
create table t2(c1 bigint) with (fillfactor = 100);
create index t2_x01 on t2 (c1) with (fillfactor = 90);
insert into t1 select i from generate_series(1,10000000) a(i); --7 sec
insert into t2 select i from generate_series(1,10000000) a(i);
analyze t1;
analyze t2;
SELECT relname, relpages*8/1024 as size_MB, reltuples, reloptions
FROM pg_class WHERE relname like 't1%' or relname like 't2%';
relname | size_mb | reltuples | reloptions
---------+---------+--------------+------------------
t1 | 382 | 9.999949e+06 | {fillfactor=90}
t1_x01 | 214 | 9.999949e+06 | {fillfactor=90}
t2 | 345 | 1e+07 | {fillfactor=100}
t2_x01 | 214 | 1e+07 | {fillfactor=90}
As expected the size of the table t1 is bigger. Unlike Oracle, we can see that PostgreSQL doesn't ignore the FILLFACTOR value in the first INSERT into a page.
And
I updated 10% of the rows of those two tables and checked the elapsed
time and the size of those tables. All I intend to do here is show you the relative impact of the fillfactor value.
do $$
DECLARE
rec record;
BEGIN
FOR rec IN select c1 from t1 where mod(c1,10)= 0 order by random() LOOP
begin
update t1 set c1=c1+1000000000 where c1 = rec.c1;
commit;
end;
END LOOP;
END;
$$
DO
Time: 107232.715 ms (01:47.233)
do $$
DECLARE
rec record;
BEGIN
FOR rec IN select c1 from t2 where mod(c1,10)= 0 order by random() LOOP
begin
update t2 set c1=c1+1000000000 where c1 = rec.c1;
commit;
end;
END LOOP;
END;
$$
DO
Time: 107861.746 ms (01:47.862)
You will notice that there is no difference in the elapsed time between the tables with the fillfactor value of 90 and 100
respectively. Theoretically when we update rows in the table t2, new
blocks are added to the table, which is detrimental to performance. But
the reality is that the side effect is negligible.
Now let us check how much the table size has changed.
analyze t1;
analyze t2;
analdb=# SELECT relname, relpages*8/1024 as size_MB, reltuples, reloptions
FROM pg_class WHERE relname like 't1%' or relname like 't2%';
relname | size_mb | reltuples | reloptions
---------+---------+--------------+------------------
t1 | 382 | 9.999949e+06 | {fillfactor=90}
t1_x01 | 263 | 9.999949e+06 | {fillfactor=90}
t2 | 356 | 9.998703e+06 | {fillfactor=100}
t2_x01 | 262 | 9.998703e+06 | {fillfactor=90}
Note that the size of the table t1 did not change at all, whereas
the size of the table t2 increased slightly (345 => 356).
After all, we can't see any benefits of setting the fillfactor value for a table to 90.
Conclusion
I can't guarantee the validity of the test methodology I have used here, but I think it is quite reliable. Given the test above, the default fillfactor value of 100 for a table is the best choice even in a table with many UPDATE operations.
Footnote
Doing the experiment above, I was a little puzzled. I had expected since I updated about 10% of the whole rows of the table t2 (fillfactor=100) the size of the table t2 would rise to 379 M(=345+345*0.1). But it didn't increase that much. After googling, I came to a conclusion that in-page vacuum kicked in while I was doing the exepriment. In-page vacuum is not covered in the PostgreSQL documentation. If you are interested in what it is, refer to this blog:
https://postgrespro.com/blog/pgsql/5967910
Addendum
In
Oracle I usually set the storage parameter PCTFREE for an index to 0
because it is rare to create an index on a column of which the value
changes. However, in PostgreSQL even if the value on an indexed column
doesn't change, the ctid( or a locator to the heap) value of an index
can be modified. That is why we must not set the
fillfactor for an index to 100. Surely, if the table is read-only, it is a good idea to set the fillfactor for an index to 100.