The Price of Partitioning 2

Author: 모델광    Date: 2025-08-09 (Sat) 07:43    Views: 159
Last year, I published an article describing the pitfalls of table partitioning.
This note revisits that topic, highlighting the same drawback that emerged in a recent discussion.

The Quiz
Recently, on the ProDBA cafe forum, a PostgreSQL quiz was posted:
(Subscription to the cafe is necessary to read the quiz.)

For those who are not willing to sign up for the cafe, here is the scenario:
- There is a table partitioned by REGISTER_DT, whose primary key is (REGISTER_DT, SEQ_NO).
- The total number of partitions is approximately 6,000.
- A query is executed with the condition cust_nm = 'x'.
- PostgreSQL does not support global indexes, so the CUST_NM column index must be local to each partition.
- Executing the query WHERE cust_nm = 'x' takes around 30 seconds in planning time alone.

The question: How would you work around this long parsing time?

When I first read the quiz, I thought there was no way to reduce the parsing time.
But the quiz poster suggested creating a non-partitioned table containing only (CUST_NM, REGISTER_DT, SEQ_NO) to speed up lookups. The non-partitioned table would act as a global index.

Intrigued by this idea, I decided to verify it.
Below is the script I used to create the test table and insert some data:

--I have created a non-partitioned table to compare its parsing time with the partitioned table's.
create table non_t1 (
register_dt     varchar(8),
seq_no         int       not null,
cust_nm        character varying(100),
register_date  timestamp(0),
cust_status_cd varchar(1),
register_channel_cd varchar(1),
cust_age        smallint,
active_yn       boolean,
sigungu_cd     varchar(5),
sido_cd          varchar(2),
constraint non_t1_pk primary key(register_dt, seq_no)
);

insert into non_t1
select to_char(('20240615'::date - mod(i,10000)),'yyyymmdd') as register_dt     
      ,i as seq_no
     , chr(65+mod(i,26))||i::text||'CUST_NM'
     , '20240615'::date - mod(i,10000) 
     , mod(i,5)+1 as cust_status_cd
     , mod(i,2)+1 as register_channel_cd
     , trunc(random() * 100) +1 as age
     , case when mod(i,22) = 0 then false else true end as active_yn
     , case when mod(i,1000) = 0 then '11007'
            when mod(i,1000) = 1 then '11006'
            when mod(i,1000) = 2 then '11005'
            when mod(i,1000) = 3 then '11004'
            when mod(i,1000) = 4 then '11003'
            when mod(i,1000) = 5 then '11002'
            else '11001' end                  as sigungu_cd
      , case when mod(i,4) = 0 then '01'
             when mod(i,4) = 1 then '02'
             else                   '03' end as sido_cd
  from generate_series(1,1000000) a(i);


The min and max dates from the table non_t1 are:

select min(register_dt), max(register_dt) from non_t1;
min     |max     |
--------+--------+
19970129|20240615|


To partition this data by day, we create one partition per date:

--I have created a partitioned table
set default_tablespace to '';

 create table t1 (
register_dt     varchar(8),
seq_no         int       not null,
cust_nm        character varying(100),
register_date  timestamp(0),
cust_status_cd varchar(1),
register_channel_cd varchar(1),
cust_age        smallint,
active_yn       boolean,
sigungu_cd     varchar(5),
sido_cd          varchar(2),
constraint t1_pk primary key(register_dt, seq_no)
)
partition by range(register_dt);


The following query generates the DDL statements that create each daily partition of the t1 table.

select 'create table t1_p_'||to_char('19970129'::date+i,'yyyymmdd')||
        ' partition of t1 for values from ('''||to_char('19970129'::date+i -1,'yyyymmdd')||''||
        ''') to ('''||to_char('19970129'::date+i,'yyyymmdd')||''');'
  from generate_series(1, 10000) a(i);
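
The generated statements still have to be executed. If you are using psql, one convenient way (a psql meta-command, not part of SQL itself) is \gexec, which runs each row of the previous query result as a statement:

```sql
-- psql only: \gexec executes every row of the preceding result as SQL,
-- so the generated CREATE TABLE statements run without copy-pasting.
select 'create table t1_p_'||to_char('19970129'::date+i,'yyyymmdd')||
       ' partition of t1 for values from ('''||to_char('19970129'::date+i-1,'yyyymmdd')||
       ''') to ('''||to_char('19970129'::date+i,'yyyymmdd')||''');'
  from generate_series(1, 10000) a(i) \gexec
```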


The output of the above query is as follows:

create table t1_p_19970130 partition of t1 for values from ('19970129') to ('19970130');
create table t1_p_19970131 partition of t1 for values from ('19970130') to ('19970131');
...
...
create table t1_p_20240615 partition of t1 for values from ('20240614') to ('20240615');
create table t1_p_20240616 partition of t1 for values from ('20240615') to ('20240616');


It takes around 3 minutes to complete the creation of the 10,000 partitions.

Run the query below to verify that all 10,000 partitions exist:

select count(*)
  from pg_class
 where relispartition
   and relname like 't1_p%';


First Problem: Locks
Now, when you run a simple SELECT query on this partitioned table, you might hit a snag:

select * from t1;
SQL Error [53200]: ERROR: out of shared memory
Hint: You might need to increase max_locks_per_transaction.


This error occurred because max_locks_per_transaction was left at its default of 64 while max_connections was 100. The shared lock table holds roughly max_locks_per_transaction * (max_connections + max_prepared_transactions) entries, so a single query that touches 10,000 partitions exhausts it.

In order to move on, let's raise max_connections to 1000 in the postgresql.conf file and restart the PostgreSQL engine; this parameter cannot be changed without a restart.
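
To see where those locks come from, you can watch pg_locks from a second session while the query runs in an open transaction. This is a sketch assuming two psql sessions; the exact count will vary with catalog access:

```sql
-- Session 1: keep the locks held by staying inside a transaction
begin;
select count(*) from t1;

-- Session 2: every scanned partition (plus its indexes) holds an AccessShareLock
select count(*)
  from pg_locks
 where locktype = 'relation'
   and mode = 'AccessShareLock';

-- Session 1: release the locks
commit;
```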

Sample Data Insertion on a partitioned table
Let's insert some data into the partitioned table:

insert into t1 select * from non_t1;

Partitioning Overhead: COUNT(*) Test
Now let's see the performance difference between a partitioned table and a non-partitioned table:

explain (analyze, buffers, settings)
select count(*) from non_t1;   --non-partitioned table

Aggregate  (cost=22525.22..22525.23 rows=1 width=8) (actual time=419.894..419.895 rows=1 loops=1)
  Buffers: shared read=4549
  ->  Index Only Scan using non_t1_pk on non_t1  (cost=0.42..20025.22 rows=1000000 width=0) (actual time=0.903..384.683 rows=1000000 loops=1)
        Heap Fetches: 0
        Buffers: shared read=4549
Settings: max_parallel_workers_per_gather = '0', random_page_cost = '1.1', search_path = 'scott, public, "$user"', plan_cache_mode = 'force_generic_plan'
Planning Time: 0.430 ms
Execution Time: 419.942 ms

explain (analyze, buffers, settings)
select count(*) from t1;    --partitioned table
Aggregate  (cost=37500.00..37500.01 rows=1 width=8) (actual time=335.088..336.390 rows=1 loops=1)
  Buffers: shared read=20000
  ->  Append  (cost=0.00..35000.00 rows=1000000 width=0) (actual time=0.100..279.504 rows=1000000 loops=1)
        Buffers: shared read=20000
        ->  Seq Scan on t1_p_19970130 t1_1  (cost=0.00..3.00 rows=100 width=0) (actual time=0.099..0.105 rows=100 loops=1)
              Buffers: shared read=2
        ....
        ->  Seq Scan on t1_p_20240616 t1_10000  (cost=0.00..3.00 rows=100 width=0) (actual time=0.007..0.012 rows=100 loops=1)
              Buffers: shared read=2
Settings: max_parallel_workers_per_gather = '0', random_page_cost = '1.1', search_path = 'scott, public, "$user"', plan_cache_mode = 'force_generic_plan'
Planning Time: 168.635 ms
Execution Time: 506.426 ms


We can see that planning time increased from 0.430 ms to 168.635 ms.
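
Note that this overhead applies when every partition must be considered. When the query filters on the partition key, the planner can prune to a single partition, which keeps the Append node small. A quick check (numbers will differ on your system):

```sql
-- A predicate on the partition key lets the planner prune the other 9,999 partitions
explain (analyze, buffers)
select count(*) from t1 where register_dt = '20240614';
```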

Indexed Search Test
Now let's create an index on the CUST_NM column and check the performance difference.

create index non_t1_x01 on non_t1(cust_nm);
set default_tablespace to '';
create index t1_x01 on t1(cust_nm);

--Simple search comparison
explain (analyze, buffers)
select * from non_t1 where cust_nm = 'R148009CUST_NM';

Index Scan using non_t1_x01 on non_t1  (cost=0.42..2.64 rows=1 width=51) (actual time=0.036..0.036 rows=1 loops=1)
  Index Cond: ((cust_nm)::text = 'R148009CUST_NM'::text)
  Buffers: shared read=4
Planning:
  Buffers: shared hit=98 read=39
Planning Time: 3.029 ms
Execution Time: 0.057 ms

explain (analyze, buffers)
select * from t1 where cust_nm = 'R148009CUST_NM';
Append  (cost=0.14..23650.00 rows=10000 width=51) (actual time=4.287..25.390 rows=1 loops=1)
  Buffers: shared hit=10001
  ->  Index Scan using t1_p_19970130_cust_nm_idx on t1_p_19970130 t1_1  (cost=0.14..2.36 rows=1 width=51) (actual time=0.019..0.019 rows=0 loops=1)
        Index Cond: ((cust_nm)::text = 'R148009CUST_NM'::text)
        Buffers: shared hit=1
        ..........
  ->  Index Scan using t1_p_20240616_cust_nm_idx on t1_p_20240616 t1_10000  (cost=0.14..2.36 rows=1 width=51) (actual time=0.004..0.004 rows=0 loops=1)
        Index Cond: ((cust_nm)::text = 'R148009CUST_NM'::text)
        Buffers: shared hit=1
Planning Time: 353.952 ms
Execution Time: 182.478 ms


We can observe that on the partitioned table, planning time is significantly higher (3.029 ms vs. 353.952 ms), and query elapsed time is also much longer (0.057 ms vs. 182.478 ms).

Secondary Table as a "Global Index"
Will a secondary table help alleviate this parsing issue? The following experiment examines whether it can reduce parsing time in PostgreSQL.

I have created a secondary table to serve as a global index:

create table cust_guide (
cust_nm         varchar(100) NULL,
register_dt      varchar(8) NULL,
seq_no          int4       NOT NULL
);
insert into cust_guide select cust_nm, register_dt, seq_no  from non_t1;

create index cust_guide_x01 on cust_guide(cust_nm, register_dt, seq_no);
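
For lookups on cust_guide to stay index-only (no heap fetches), the visibility map must be up to date after the bulk load, so it is worth vacuuming the table once:

```sql
-- Refresh the visibility map and statistics so cust_guide lookups can be index-only
vacuum analyze cust_guide;
```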


Now let's check the performance of queries that use this secondary table.

First, I ran a query against the non-partitioned table NON_T1:

explain (analyze, buffers)
select *
  from cust_guide a, non_t1 b 
 where a.seq_no = b.seq_no
   and a.register_dt = b.register_dt
   and a.cust_nm = 'R148009CUST_NM';

Nested Loop  (cost=0.85..4.19 rows=1 width=78) (actual time=0.805..0.808 rows=1 loops=1)
  Buffers: shared hit=5 read=3
  ->  Index Only Scan using cust_guide_x01 on cust_guide a  (cost=0.42..1.54 rows=1 width=27) (actual time=0.035..0.037 rows=1 loops=1)
        Index Cond: (cust_nm = 'R148009CUST_NM'::text)
        Heap Fetches: 0
        Buffers: shared hit=4
  ->  Index Scan using non_t1_pk on non_t1 b  (cost=0.42..2.64 rows=1 width=51) (actual time=0.763..0.763 rows=1 loops=1)
        Index Cond: (((register_dt)::text = (a.register_dt)::text) AND (seq_no = a.seq_no))
        Buffers: shared hit=1 read=3
Planning:
  Buffers: shared hit=131 read=6
Planning Time: 2.143 ms
Execution Time: 0.840 ms


Then I ran the same query against the partitioned table T1:

explain (analyze, buffers)
select *
  from cust_guide a, t1 b 
 where a.seq_no = b.seq_no
   and a.register_dt = b.register_dt
   and a.cust_nm = 'R148009CUST_NM';

Nested Loop  (cost=0.57..23776.54 rows=1 width=78) (actual time=1.200..1.919 rows=1 loops=1)
  Buffers: shared hit=6
  ->  Index Only Scan using cust_guide_x01 on cust_guide a  (cost=0.42..1.54 rows=1 width=27) (actual time=0.022..0.023 rows=1 loops=1)
        Index Cond: (cust_nm = 'R148009CUST_NM'::text)
        Heap Fetches: 0
        Buffers: shared hit=4
  ->  Append  (cost=0.14..23675.00 rows=10000 width=51) (actual time=0.023..0.740 rows=1 loops=1)
        Buffers: shared hit=2
        ->  Index Scan using t1_p_19970130_pkey on t1_p_19970130 b_1  (cost=0.14..2.36 rows=1 width=51) (never executed)
              Index Cond: (((register_dt)::text = (a.register_dt)::text) AND (seq_no = a.seq_no))
              ...............
        ->  Index Scan using t1_p_20240616_pkey on t1_p_20240616 b_10000  (cost=0.14..2.36 rows=1 width=51) (never executed)
              Index Cond: (((register_dt)::text = (a.register_dt)::text) AND (seq_no = a.seq_no))
Planning Time: 31575.263 ms
Execution Time: 478.231 ms


As we can see, parsing time skyrocketed from 2.143 ms for the non-partitioned table to 31575.263 ms for the partitioned table, despite using the secondary table as a join driver. This confirms that creating a secondary table containing the search key and primary key columns does not address the underlying parsing overhead on a large partitioned table.

Conclusion
1. As the number of partitions in a table increases, the potential lock-related overhead also grows.
2. Using too many partitions increases planning time significantly.
3. A secondary table containing the search key and primary key columns does not mitigate parsing time.

Footnote
I owe much of this content to the book "PostgreSQL 튜닝 기술(하)".

Addendum: Using a Generic Plan
When I asked ChatGPT about this issue, it advised me to combine a prepared statement with plan_cache_mode = force_generic_plan on the partitioned table. Building a prepared-statement-friendly architecture is not always easy, but I performed the experiment anyway.

Firstly, I have set the plan_cache_mode to force_generic_plan:

set plan_cache_mode = force_generic_plan;

Then, I have created a prepared statement.

prepare test(text) as
select * from t1 where cust_nm = $1;

Then I have run the query.

explain (analyze, buffers)
execute test('R148009CUST_NM');

Append  (cost=0.14..23650.00 rows=10000 width=51) (actual time=796.977..800.163 rows=0 loops=1)
  Buffers: shared read=10000 written=1
  ->  Index Scan using t1_p_19970130_cust_nm_idx on t1_p_19970130 t1_1  (cost=0.14..2.36 rows=1 width=51) (actual time=0.438..0.438 rows=0 loops=1)
        Index Cond: ((cust_nm)::text = '$1'::text)
        Buffers: shared read=1
        .................
Planning Time: 12.142 ms
Execution Time: 959.030 ms


With a generic plan, planning time dropped from 353.952 ms to 12.142 ms, which is still higher than the parsing time (2.143 ms) of the non-partitioned table.
This demonstrates that plan caching can effectively alleviate excessive planning delays on high-partition-count tables, though parsing time remains roughly 6x higher than on the non-partitioned table.
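
On PostgreSQL 14 or later, you can confirm that the generic plan is actually being reused by checking the plan-choice counters in the pg_prepared_statements view (assuming the prepared statement above, named test):

```sql
-- generic_plans / custom_plans count how each execution of the statement was planned;
-- with force_generic_plan, generic_plans should grow with every execute.
select name, generic_plans, custom_plans
  from pg_prepared_statements
 where name = 'test';
```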

 

postgresdba.com