Recently while watching a YouTube
video, it has just occurred to me that I do something stupid when I do
not know the technology I work with.
Here is the URL of the YouTube video.
(Warning! If you are under 19 years old, do not watch the video.)
https://www.youtube.com/watch?v=8rh6qqsmxNs
At the 40 second part of the video, the comedian said "Think about how stupid the average person is, and then realize that half of them are stupider than that."
Last
year when I worked on a project where I had to port Oracle functions
to PostgreSQL functions, there was a user defined function which pushed
me to do something stupid.
Let me kick things off by introducing an Oracle function code that I had to port.
CREATE OR REPLACE FUNCTION F_CONNECT_BY(
P_GROUP_CD VARCHAR2 )
RETURNVARCHAR2
IS
v_string VARCHAR2(2000);
BEGIN
SELECTSUBSTR(SYS_CONNECT_BY_PATH(CD_NM,','),2)
INTOv_string
FROM(
SELECTCD_NM
, ROW_NUMBER()
OVER (ORDER BY CD_NM) RN
, COUNT(*)
OVER () AS CNT
FROMCOM_CODE
WHEREGROUP_CD = p_group_cd
)
WHERE RN =
CNT --check
condition
START
WITH RN =
1
--access condition
CONNECT
BY RN
= PRIOR RN+1; --join
condition
RETURNv_string;
END;
The following is the DDL for the table COM_CODE.
create
table com_code (
group_cd varchar(10),
cd varchar(10),
cd_nm varchar(100));
insert
into com_code values ('G1','11001','SEOUL')
,('G1','11002','PUSAN')
,('G1','11003','INCHEON')
,('G1','11004','DAEGU')
,('G1','11005','JAEJU')
,('G1','11006','ULEUNG')
,('G1','11007','ETC');
insert
into com_code values ('G2','1','Infant')
,('G2','2','Child')
,('G2','3','Adolescent')
,('G2','4','Adult')
,('G2','5','Senior');
insert
into com_code values ('G3','01','Jeonbuk')
,('G3','02','Kangwon')
,('G3','03','Chungnam');
alter
table com_code add constraint com_code_pk
primary key (group_cd, cd);
Before reading this article on, think for a minute, "how would you write the above function code in PostgreSQL?"
PostgreSQL does not support the START WITH, CONNECT BY, PRIOR, and SYS_CONNECT_BY_PATH things. As soon as I saw the Oracle function code, I started to write a PostgreSQL function code using WITH RECURSIVE CLAUSE like the following:
WITH recursive W1 AS (
SELECT group_cd, cd, cd_nm::varchar
, row_number() over (order by cd_nm) rn
, count(*) over () as cnt
FROM COM_CODE
WHERE group_cd = 'G3'
)
, W AS (
SELECT group_cd, cd, cd_nm, rn, cnt
FROM W1
WHERE group_cd = 'G3'
UNION ALL
SELECT a.group_cd, a.cd, a.cd_nm||','||w.cd_nm::varchar as cd_nm, w.rn+1 as rn, w.cnt
FROM w1 a join lateral (select *
from w
where a.group_cd = w.group_cd
and a.rn <= w.rn
offset 0) w
ON true
AND w.rn <= a.cnt
)
SELECT *
FROM w
order by cd_nm;
Even
though I spent about 2 hours trying to come up with an alternative
PostgreSQL code to the Oracle code, I failed. I did a lot of googling
and refered to the following posts:
https://stackoverflow.com/questions/35248217/multiple-cte-in-single-query
https://heap.io/blog/postgresqls-powerful-new-join-type-lateral
Desperately
when I ran the SQL statement in the Oracle function allocating the
value 'G1' to the p_group_cd variable I got the following result:
DAEGU,ETC,INCHEON,JAEJU,PUSAN,SEOUL,ULEUNG |
As you can see the result of the Oracle user defined function was the output of a built-in LIST_AGG() function. That was the mement when I realized that the writer of the F_CONNECT_BY() was an idiot and I was also an idiot. The Oracle function writer and I
were trying to rewrtie the database built-in function. And PostgreSQL
implements the same functionality with STRING_AGG().
Conclusion
Database
companies have spent a lot of money developing database features. I am
semi-serious. In PostgreSQL the stupidest PostgreSQL core team member is
probably smarter than you. You need to know the technology you are working with. Do not try to rewrite the database feature on your own.
Footnote
In
the video I introduced at the beginning of this note, the comedian went
on to say "It does not take long to spot one of them (stupid people),
does it? It takes you about eight seconds." But it took me about 2 hours
to realize that I was an idiot. I must be one of those who are stupider
than the average stupid person.
Addendum
If
you are a God-demn idiot and love codes of high complexity, you might
have ported the Oracle function to PostgreSQL like the following:
CREATE OR REPLACE FUNCTION F_CONNECT_BY(
P_GROUP_CD VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL STABLE PARALLEL SAFE STRICT
AS
$$
WITH recursive W1 AS (
SELECT group_cd, cd, cd_nm, rn, lag(rn) over(order by rn) as rn2, count(rn) over() as cnt
FROM (
SELECT group_cd, cd, cd_nm
, row_number() over (order by cd_nm desc) rn
FROM COM_CODE
WHERE group_cd = P_GROUP_CD
) AS FOO
)
, W AS (
SELECT group_cd, cd, cd_nm::varchar, rn, rn2, cnt
FROM W1
WHERE 1=1
AND rn2 IS NULL
UNION ALL
SELECT a.group_cd, a.cd, a.cd_nm||','||w.cd_nm::varchar as cd_nm, a.rn, a.rn2 as rn2, a.cnt
FROM w1 a join lateral (select * -- I forced a nested loop join to keep the ordering.
from w
where w.group_cd = a.group_cd
and w.rn = a.rn2
offset 0) w
ON true
)
SELECT cd_nm
FROM w
WHERE w.cnt = w.rn;
$$