There are a few details in PostgreSQL data type conversion that might make a dramatic difference to performance for a small investment in effort. The performance difference is so significant that I decided to write about it.
In PostgreSQL, you can use the '::' opoerator for data type cating, allowing you to explicitly convert values from one data type to another. In most cases the '::' operators perform better than built-in data type conversion functions. This article examines the performance difference in the context of a real life production query.
Here is some code to demonstrate the point. I have tested it on PostgreSQL 15.1.create table online_order (
ord_no bigint not null,
cust_no int not null,
ord_date timestamp(0) not null,
ord_dt varchar(8) not null,
ord_status_cd varchar(1) not null,
comment varchar(100)
);
insert into online_order
select i, mod(i,1000000) as cust_no
,current_date - mod(i,1000) as ord_date
,to_char((current_date - mod(i,1000)),'yyyymmdd') as ord_dt
,(mod(i,4) + 1) as ord_status_cd
,lpad('x',100,'x')
from generate_series(1,2000000,2) a(i);
alter table online_order add constraint online_order_pk primary key (ord_no);
Here are the two queries we need to investigate. One is using the ::timestamp operator and the other the to_timestamp() function. Before each run, I ensured that the ONLINE_ORDER table was loaded into memory using the pg_prewarm extension:SELECT pg_prewarm('online_order');
set max_parallel_workers_per_gather = 0;
SELECT *
FROM ONLINE_ORDER
WHERE ORD_DATE = '20211201'
::TIMESTAMP;
SELECT *
FROM ONLINE_ORDER
WHERE ORD_DATE =
TO_TIMESTAMP('20211201','YYYYMMDD');
Here are the execution plans that we get from running the EXPLAIN command.Seq Scan on online_order (cost=0.00..63462.00 rows=1988 width=128)
(actual time=0.011..152.770 rows=2000 loops=1)
Filter: (ord_date = '2021-12-01 00:00:00'
::timestamp without time zone)
Rows Removed by Filter: 1998000
Buffers: shared hit=3516 read=34946
Planning Time: 0.048 ms
Execution Time: 152.956 ms
Seq Scan on online_order (cost=0.00..68462.00 rows=1988 width=128)
(actual time=0.017..638.778 rows=2000 loops=1)
Filter: (ord_date =
to_timestamp('20211201'::text, 'YYYYMMDD'::text))
Rows Removed by Filter: 1998000
Buffers: shared hit=3612 read=34850
Planning Time: 0.095 ms
Execution Time: 639.033 ms
'::timestamp" is a type cast operator that is used to explicitly convert a value from one data type to a timestamp data type. "to_timestamp()" is a built-in function in PostgreSQL that is used to convert a string representation of a timestamp into an actual timestamp data type. The choice between them depends on whether you are working with existing data of a different type that needs to be treated as a timestamp or if you need to parse and convert a string representation into a timestamp.
Note that when we use the ::timestamp operator, the cost is 63462. However, when we use the to_timestamp function, the cost rises to 68462. The cost difference is not dramatic. However, the elapsed time increased from 152 ms to 639ms. We can conclude that the second query is spending most of its time executing the to_timestamp() function.
This is one of the little details that can make a big difference in performance when we convert a literal value into another data type.
Conclusion
If you encounter SQL that converts a value into another data type and you notice the built-in function being used, take a little time to consider whether the built-in function can be replaced with the '::' operator.
In PostgreSQL, when you convert a literal value into another data type, it is preferable to use the '::' operator rather than built-in functions.
Footnote
Here are som common data type conversions using the '::' operator.
1. Casting to Text:
'value::text' or 'value::varchar' can be used to cast a vaule to text.
2. Casting to Integer
'value::integer' is used to cast a value to an integer.
3. Casting to Numeric
'value::numeric' is used to cast a value to a fixed-point number.
4. Casting to Boolean
'value::boolean' is used to cast a value to a boolean (true/false)
5. Casting to Date
'value::date' is used to cast a value to a date.
6. Casting to array
'value::array_type' is used to cast a value to an array
7. Casting to Inet
'value::inet' is used to cast a value to IPv4 or IPv6 address.