I do not suppose many projects
allow developers to create functions in a database because user defined
functions can incur a lot of problems and it is difficult to diagnose
and fix those problems. But following reminder about avoiding mistakes
is probably valid in a project where UDFs are allowed for some reason.
Last
year I worked on a project where I had to port Oracle functions to
PostgreSQL functions. And I was shocked to see more than 100 UDFs and
triggers in an AS-IS database. And it seemed that the database had no
problems with those many UDFs.
After
porting Oracle functions, some anomalies were reported in some queries
using UDFs. While doing a user acceptance test, some queries reported
different results even though the input value was the same. After a deep
investigation, we found that there were code errors in the function but PostgreSQL failed to report it.
Here
is a demonstration based on a common, trivial, but annoying error I ran
into in the project- it starts with a table and a user defined
function.
CREATE
TABLE employee (
empno numeric(5,0) NOT NULL,
ename character varying(10),
job character varying(9),
mgr numeric(5,0),
hiredate timestamp(0),
sal numeric(7,2),
comm numeric(7,2),
deptno numeric(2,0),
sido_nm character varying(100)
);
insert
into employee
select
i, chr(65+mod(i,26))||i::text||'NM'
,case when mod(i,10000)=0 then
'PRESIDENT'
when mod(i,1000) = 0 then 'MANAGER'
when mod(i,3)=0 then 'SALESMAN'
when mod(i,3)=1 then 'ANALYST'
when mod(i,3)=2 then 'CLERK'
end as job
,case when mod(i,10000)= 0 then null
when mod(i,1000)= 1 then 10000
when i >= 9000 then 1000
else ceiling((i+1000)/1000)*1000
end as mgr
, current_date - i
, trunc(random() * 10000) as sal
, trunc(random() * 10000) as com
, mod(i,12)+1 as deptno
, case when mod(i,3) = 0 then 'Jeonbuk'
when mod(i,3) = 1 then 'Kangwon'
else 'Chungnam'
end as sido_nm
from
generate_series(1,10000) a(i);
ALTER
TABLE employee ADD CONSTRAINT employee_pk
PRIMARY KEY (empno);
select
pg_relation_size('employee');
0.9Mbytes
The following pattern of function was the culprit of frequent anomalies.
CREATE OR REPLACE FUNCTION F_TEST(
p_deptno IN NUMERIC)
RETURNS VARCHAR
LANGUAGE PLPGSQL STABLE PARALLEL SAFE
AS
$$
DECLARE
V_SIDO_NM EMPLOYEE.SIDO_NM%TYPE;
BEGIN
SELECT SIDO_NM
INTO V_SIDO_NM
FROM EMPLOYEE
WHERE deptno = p_deptno;
RETURN v_sido_nm;
END;$$;
Did you notice the code error in the function? The code error should be quite easy to see in the above function - but in a more
complex example with a longer piece of code it might not be so visible.
The error is that the query in the function returns many rows (only
first row is used, but it can be an unexpected row). It is very common
for developers to overlook the possibility of returning many rows. The
problem is that when you create and execute the function, it does not
return any errors.
Here is the output from running the function:
analdb=# select f_test(2)
+---------+
| f_test |
+---------+
| Kangwon |
+---------+
So
when a developer creates a function and runs the function in the dev
environment, he or she can not notice that there is an error in the
code. So the developer decides to ship the code to the production
environment. And some time later a user gets a different result value in
the quries involving the function.
How can we avoid this nasty problem? Let's run the function, but precede it with setting plpgsql.extra_warnings to "all":
analdb=# set plpgsql.extra_warnings to 'all'
SET
analdb=# select f_test(2)
WARNING: query returned more than one row
HINT: Make sure the query returns a single row, or use LIMIT 1.
+---------+
| f_test |
+---------+
| Kangwon |
+---------+
Setting plpgsql.extra_warnings worked, so the function will still execute and return the result, but
we have had an explicit warning of exactly what we've done wrong and
some tips of what we can do to remove the error.
Conclusion
When you create a function and test it in a development environment, you should set plpgsql.extra_warnings to "all". In default settings - no warning is raised. PL/pgSQL is too tolerant.
Footnote
I am afraid I do not know what impact setting plpgsql.extra_warnings to "all" in the production environment will have. It has the possiblilty of increasing the CPU load.