A recent question on a Naver Cafe prompted me to write this note.
https://cafe.naver.com/dbian/4988
The person who put up the question wanted to interpret an execution plan produced by PostgreSQL.
Here is the execution plan he wanted to read.
I will call this plan an original plan from now on.
--Original plan
QUERY PLAN
----------------------------------------------------------------------------------
Bitmap Heap Scan on t2 (actual time=0.569..5.816 rows=1000 loops=1)
Recheck Cond: ((c1 >= 1) AND (c1 <= 1000))
Heap Blocks: exact=1000
Buffers: shared hit=1002 read=2
-> Bitmap Index Scan on t2_idx01 (actual time=0.269..0.271 rows=1000 loops=1)
Index Cond: ((c1 >= 1) AND (c1 <= 1000))
Buffers: shared hit=2 read=2
He wanted to know what exact=1000 means. Does it mean that the query accessed 1,000 table blocks?
Or Does it mean that the query accessed 1,000 rows?
I don't know what the original query is but I think the search condition in the original query made a person confused because by coincidence the number of rows retrieved is identical to the number of table blocks the query accessed. So it is very hard to give a clear explanation of how to read the execution plan under investigation.
So I decided to run up a test script to explain how to read the plan.
CREATE TABLE t1
AS SELECT i as c1, i*20 as c2
FROM generate_series(1,10000000) a(i)
ORDER BY random();
CREATE INDEX t1_x01 ON t1 (c1);
Note that I created an index on the column c1 and the values of c1 in the table are not ordered.
In order to get the execution plan we want I changed some parameters and the size of work_mem.
set max_parallel_workers_per_gather=0;
SET ENABLE_INDEXSCAN=FALSE;
set enable_seqscan=false;
set work_mem = 64; --64 is the least value we can set to work_mem.
Below is a query I ran followed by its execution plan.
SELECT *
FROM T1
WHERE c1 BETWEEN 100000 AND 200000;
Bitmap Heap Scan on t1 (actual time=5.066..581.441 rows=100001 loops=1)
Recheck Cond: ((c1 >= 100000) AND (c1 <= 200000))
Rows Removed by Index Recheck: 8786038
Heap Blocks: exact=417 lossy=39317
Buffers: shared hit=40010
-> Bitmap Index Scan on t1_x01 (actual time=5.002..5.002 rows=100001 loops=1)
Index Cond: ((c1 >= 100000) AND (c1 <= 200000))
Buffers: shared hit=276
Planning:
Buffers: shared hit=19 read=1
Planning Time: 0.154 ms
Execution Time: 585.077 ms
Now I am going to give a detailed explanation of how to read the plan.
1. While accessing the t1_x01 index, PostgreSQL hit 276 blocks in the buffer cache.
But in the original plan, PostgreSQL hit 2 blocks in the buffer cache and read 2 blocks in the file system.
2. While accessing the table t1, PostgreSQL hit 39734 (40010 - 276) blocks in the buffer cache.
But in the original plan, PostgreSQL hit 1000 (1002-2) blocks in the buffer cache and read 0 (2-2) blocks in the file system.
3. The reason why PostgreSQL chooses to access a bitmap index scan is because the clustering factor of the index t1_x01 is bad. Recall that I used order by random() to make the clustering factor bad when I created the table t1. In order to reduce the number of block I/Os PostgreSQL first scan the index and constructs a bitmap of potential row locations. The bitmap is constructed in work_mem. When the size of work_mem is enough, one index block uses one bit to locate a row which is called an exact mode. When the size of work_mem is not enough, one index block uses one bit to locate a table block where the matching row is, which is called a lossy mode. PostgreSQL uses a lossy mode to reduce the size of a bitmap. In a lossy mode PostgreSQL has to check the search condition again againt a table block to retrieve the exact matching row, which is the reason why it takes more time in a lossy mode.
4. Our observation of the execution plan above (lossy = 39317) tells us that a lossy mode kicked in.
On the other hand in the original execution plan a lossy mode did not kick in.
In the original execution plan exact=1000 means it had access to 1000 table blocks. exact = 1000 has nothing to do with the number of rows. In the plan above the number of table block hit can also be calculated as follows:
417(=exact) + 39317(=lossy) = 39734
After accessing 39317 table blocks PostgreSQL checked the search condition again and removed 8786038 rows.
In the original plan where a lossy mode didn't kick in, there is no Rows Removed by Index Recheck~.
I hope this note will save someone a bit of time interpreting the execution plan.
Footnote
There are two options you can do to remove a lossy bitmap.
One method is to sort the data in the table t1 by the column c1. And the other is to increase the work_mem parameter.
Let us increase the work_mem parameter.
set work_mem = 64000;
I have rerun the query and got this execution plan.
Bitmap Heap Scan on t1 (actual time=18.417..117.585 rows=100001 loops=1)
Recheck Cond: ((c1 >= 100000) AND (c1 <= 200000))
Heap Blocks: exact=39734
Buffers: shared hit=40010
-> Bitmap Index Scan on t1_x01 (actual time=13.749..13.749 rows=100001 loops=1)
Index Cond: ((c1 >= 100000) AND (c1 <= 200000))
Buffers: shared hit=276
Planning:
Buffers: shared hit=4
Planning Time: 0.340 ms
Execution Time: 121.039 ms
As you can see PostgreSQL didn't use a lossy bitmap after increasing the parameter work_mem. I can't understand why the "Recheck Cond:" is still there in the execution plan. Even after vacuuming the table t1, it is still there. It is just a PostgreSQL bug, I think. There is no reason to check the index access condition again in the table t1.
Addendum
I highly recommend you to read the page 104 to 111 of the following slide.
https://www.slideshare.net/SiyeonAcademy/postgresql-20178
Added on May 29, 2022
All the index scans can run in parallel. During a Bitmap Index Scan, the building of the bitmap in the Bitmap Index Scan is always performed by a single process. When the bitmap is done, the scanning is performed in parallel in the Parallel Bitmap Heap Scan node:
Gather (actual time=623.578..958.508 rows=100001 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Bitmap Heap Scan on t1 (actual time=475.601..742.829 rows=33334 loops=3)
Recheck Cond: ((c1 >= 100000) AND (c1 <= 200000))
Heap Blocks: exact=20836
-> Bitmap Index Scan on t1_x01 (actual time=586.595..586.596 rows=100001 loops=1)
Index Cond: ((c1 >= 100000) AND (c1 <= 200000))
When the clustering factor is low, Bitmap Index Scan outperforms Index Scan by far.