설문조사
PostgreSQL/PPAS 관련 듣고 싶은 교육은


총 게시물 169건, 최근 0 건
   

PRAGMA AUTONOMOUS_TRANSACTION

글쓴이 : 모델광 날짜 : 2022-10-22 (토) 15:40 조회 : 3008

Last year while migrating from Oracle to PostgreSQL, I came across a UDF using an Oracle Autonomous Transaction. Autonomous Transaction provides a functionality to the developer in which it allows to do changes in a separate transaction and to save or discard that particular transaction without affecting the main session transaction. I know you may think what the hell I am talking about. Please be patient. At the end of this article you will get the hang of it.

Let's start off with the ORACLE function code we have to convert:


CREATE OR REPLACE FUNCTION F_SINGO_NEXT_VAL

(v_ymd IN VARCHAR2)

RETURNVARCHAR2

IS

PRAGMA AUTONOMOUS_TRANSACTION;

  v_last_singo_no VARCHAR2(17)

BEGIN

  UPDATEsingo_no

       SETlast_singo_no

            = 'S' || v_ymd ||

               LPAD(TO_CHAR(TO_NUMBER(

                              NVL(SUBSTR(last_singo_no,10,17),'0')

                              ) +1),8,'0')

   ;

   --업데이트 데이터가 없으면, 최초 채번이므로 INSERT 수행

  IF SQL%ROWCOUNT = 0 THEN

      INSERT INTO singo_no VALUES ('S'||v_ymd||'00000001');

  END IF;

   --채번값 GET

  SELECT last_singo_no

      INTOv_last_singo_no

     FROM singo_no;

  COMMIT;                           --transaction commit

  RETURNv_last_singo_no;

END;


Here is the DDL of the table SINGO_NO:


CREATE TABLE singo_no (LAST_SINGO_NO VARCHAR2(17) NOT NULL);

ALTER TABLE singo_no ADD CONSTRAINT singo_no_PK PRIMARY KEY(LAST_SINGO_NO);


The structure of the LAST_SINGO_NO value is as follows:


'S' + 'yyyymmdd' + 8 digit number  


For example, the column can have a value, 'S2022082900002345'.


Before dealing with Autonomous Transaction I want to address the primary key format. When we design the structure of an artificial identifier (or surrogate key), there are three strategies we can use:

1. Use a sequence object offered by DBMS.

2. Use a max(primary key value) + 1 value.

3. Create an intermediate table to manage a maximum primary key value and use the value in the table as a primary key.


Using a sequence object is the best approach to obtaining a unique value. But I frequently run across developers who want to have a primary key value containing some business rules. Business rules include, for example, when the row was inserted, where the row was created and then some. The idea of a primary key value preserving a business rule puts the system at risk.  If any of the components of the key changes, then all applications that access the key are affected. The moment the business rule changes, you have to modify all the primary key values and foreign key values in the child table as well. I had been confronted with a primary key value consisting of a region code and a sequence number and one day when the region code had to be divided into 3 region codes, the admistrators of the system panicked.

The max(pk)+1 approach seriously compromises concurrency, which means that we have to programmatically force only one session to get the next primary key value at a time.


The Oracle UDF under investigation is used to extract a unique value revealing some business rules which say that the primary key value must start with the character 'S' and 'yyyymmdd' followed by eight digit numbers. Looking at the code closely, we can notice that the function is defined as an autonomous transaction (PRAGMA AUTONOMOUS TRANSACTION).

In subsequent paragraphs I will explain the mechanism of autonomous transactions and I will give you a technique you might find useful in PostgreSQL.


Since PostgreSQL does not explicitly support autonomous transactions, I simply wrote following code ignoring the autonomous transaction functionality.


create table singo_no (

last_singo_no varchar(40) not null

);

alter table singo_no add primary key  (last_singo_no);


I have set the storage parameter as follows:


alter table singo_no set (autovacuum_vacuum_scale_factor= 0.0);

alter table singo_no set (autovacuum_vacuum_threshold= 150);

alter table singo_no set (autovacuum_vacuum_insert_threshold= 150);


I intended to trigger a VACUUM whenever the table is updated 150 times. If I don't change the default settings, the table can get bloated.


INSERT INTO singo_no VALUES('2021013100000001');


I have inserted one row into the SINGO_NO table to make a sql fuction. If there had been no rows in the table, I would have had to create a plpgsql function.


Here is the PostgreSQL function code - converted from the Oracle UDF.


create or replace function singo_next_val(v_ymd in varchar)

returnsvarchar

language sql volatile parallel unsafe

as

$$

UPDATEsingo_no

   SETlast_singo_no

             = 'S'||v_ymd||LPAD((

                          COALESCE(SUBSTR(last_singo_no,10,17),'0')::numeric

                                 +1)::text,8,'0')

RETURNINGlast_singo_no ;

$$


I have created a tiny table T1 and an anonymous code block to give an explanation of what an autonomous transaction is:


drop table t1;
CREATE TABLE T1 (C1 varchar(17), C2 varchar(100) not null);
ALTER TABLE T1
ADD CONSTRAINT T1_PK PRIMARY KEY (C1);


do
$main$
BEGIN
  insert into t1 values(singo_next_val('20221009'),'dummy');

  -- I have simulated some long-running query.

  PERFORM pg_sleep(10);
END;
$main$


In session 1, I have executed an anonymous code block.


--session 1

do
$main$
BEGIN
  insert into t1 values(singo_next_val('20221009'),'dummy');

  -- I have simulated some long-running query.

  PERFORM pg_sleep(10);
END;
$main$


As soon as I execute the anonymous block in session 1, I have opened a new session and run the following code in session 2:


--session 2

select singo_next_val('20221009');


We can observe that session 2 goes on waiting until session 1 completes. This is because session 1 holds the exclusive lock on the record in the SINGO_NO table and session 2 also tries to hold the lock on the record. In Oracle by defining the singo_next_val function as autonomous transaction, we would able to release the lock inside the anonymous block as soon as the INSERT statement is run.

To implement this functionality we can use the PostgreSQL dblink. The dblink actually opens a new connection and runs a query using this new connection. Any SQL operation executed via a dblink is committed immediately regardless of the main transaction in the anonymous block.


I have created the dblink extension.


create extension dblink;


The modified function below updates a row via a dblink. This update will get committed even if the calling transaction is rolled back:


create or replace function singo_next_val_at(v_ymd in varchar)

returnsvarchar

language sql volatile parallel safe

as

$body$

SELECT*

  FROM

      dblink('host=127.0.0.1 port=5444 user=scott dbname=analdb',

              format($$UPDATEsingo_no

                                     SETlast_singo_no

                                            = 'S'||%L||LPAD((

                                              COALESCE(SUBSTR(last_singo_no,10,17),'0')::numeric

                                                 +1)::text,8,'0')

                              RETURNINGlast_singo_no

                          $$, v_ymd)) as t1(last_singo_no varchar(17))

$body$


I have opened a session and executed an anonymous code block.


--session 1

do
$main$
BEGIN
  insert into t1 values(singo_next_val('20221009'),'dummy');

  -- I have simulated some long-running query.

  PERFORM pg_sleep(10);
END;
$main$


The function singo_next_val_at() above calls UPDATE via a dblink and the operation is committed immediately whether or not subsequent queries (PERFORM pg_sleep(10) in this block) fail.


As soon as I execute the anonymous block in session 1, I have opened a new session and run the following code in session 2:


--session 2

select singo_next_val_at('20210121');


Now note that we can get the result in session 2 even though the anonymous code block is still being executed in session 1. By implementing Autonomous Transaction in a PostgreSQL function we increased concurrency. In some circumstances the performance benefit is tremendous.


The PostgreSQL dblink is session-specific, which mans that each session will have to open a new DB connection and this increases response time. The dblink connection closes automatically at the end of the session.


Wrap Up

PostgreSQL doesn't have a direct equivalent for autonomous transaction but we can use dblink to get a similar functionality.


Footnote

When you migrate database from Oracle to PostgreSQL, you should be proactive in the project. The first thing you have to do is to communicate with stakeholders in the project. If there are some functionalites that PostgreSQL does not support,you have to convince stakeholders and customize the business requirement not to use the functionality.

When I first encountered the Oracle function under investigation, I tried to convince application architects to change the business rule of making up the primary key value saying the business rule in the Oracle database was too complicated, risky, and inefficient. We would be able to improve performance by using the sequence object. And there is not much benefit for encoding more than one fact (a combination of 'yyyymmdd' and identification numbers) in the primary key. If we change the format of the primary key value, we can eliminate the UDF, which will result in less contention and less wait events. The performance benefit we could get from not generating an intelligent key (a key which contains some business rules) is huge.


Addendum

The problem with the above function is that you are specifying the database credential in the function. The workaround for this is to use dblink_fdw to create the remote server and user mapping as follows:


CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER loopback_dblink FOREIGN DATA WRAPPER dblink_fdw
 OPTIONS (hostaddr '127.0.0.1', dbname 'analdb');
CREATE USER MAPPING FOR scott SERVER loopback_dblink
 OPTIONS (user 'scott', password 'tiger');


create or replace function singo_next_val_at2(v_ymd in varchar)
returns varchar
language sql volatile parallel safe
as
$body$
SELECT *
  FROM
      dblink('loopback_dblink',
              format ($$ UPDATE singo_no
                                     SET last_singo_no
                                            = 'S'||%L||LPAD((
                                              COALESCE(SUBSTR(last_singo_no,10,17),'0')::numeric
                                                 +1)::text,8,'0')
                              RETURNING last_singo_no
                          $$, v_ymd)) as t1(last_singo_no varchar(17))
$body$;

select singo_next_val_at2('20210121');


   

postgresdba.com