Recently, I delivered a lecture aimed at SQL Server developers who are new to PostgreSQL. While preparing, I had to present guidelines on how to translate the CROSS APPLY and OUTER APPLY operators into PostgreSQL.
In SQL Server, the CROSS APPLY and OUTER APPLY operators are used to join a table to subqueries or row-returning functions that reference columns from the driving (outer or preceding) table. They are particularly useful when the right-side table of the join depends on values from the left-side table.
In this note, I will show you how we can port SQL Server's CROSS APPLY statements to PostreSQL.
As usual, I will start with sample codes to create test data in PostgreSQL 16.3:
create table M (id int, val int, status varchar(3));
create table S (id int, val int, status varchar(3));
begin;
insert into M values (1, 10, 'CUR');
insert into M values (2, 20, 'NEW');
insert into M values (3, 30, 'CUR');
insert into M values (5, 30, 'DEL');
insert into M values (6, 5, 'DEL');
insert into S values (2, 25, 'CUR');
insert into S values (3, 15, 'CUR');
insert into S values (4, 41, 'CUR');
insert into S values (5, 10, 'DEL');
commit;
--Displaying the data
select * from M;
| id | val | status |
+----+-----+--------+
| 1 | 10 | CUR |
| 2 | 20 | NEW |
| 3 | 30 | CUR |
| 5 | 30 | DEL |
| 6 | 5 | DEL |
select * from S;
| id | val | status |
+----+-----+--------+
| 2 | 25 | CUR |
| 3 | 15 | CUR |
| 4 | 41 | CUR |
| 5 | 10 | DEL |
This script can also be used in SQL Server. As a side note, in SQL Server, varchar(n) indicates that the database stores 'n' bytes of data in the column. On the other hand, in PostgreSQL, varchar(n) indicates that the database stores 'n' characters in the column.
Here are the two statements we have to port from SQL Server:
SELECT A.ID, A.VAL, A.STATUS, V.ID, V.VAL, V.STATUS
FROM M A
CROSS APPLY (
SELECT ID, VAL, STATUS
FROM S B
WHERE B.ID = A.ID
) AS V
WHERE A.STATUS = 'DEL';
SELECT A.ID, A.VAL, A.STATUS, V.ID, V.VAL, V.STATUS
FROM M A
OUTER APPLY (
SELECT ID, VAL, STATUS
FROM S B
WHERE B.ID = A.ID
) AS V
WHERE A.STATUS = 'DEL';
In the first query, for each row in the M table where status = 'DEL', the subquery in the CROSS APPLY is executed. the subquery retrieves rows from the S table where the ID matches the ID of the current row from the M table.
In the second query, it works similarly but the difference is that if the OUTER APPLY subquery finds no matching rows in the S table, the row from the M table is included in the result and the columns V.ID, V.val, and V.STATUS are returned as NULL.
In PostgreSQL, the concept equivalent to SQL Server's CROSS APPLY and OUTER APPLY is accomplished using the LATERAL join. The LATERAL keyword allows a subquery in the FROM clause to refer to columns of driving tables in the FROM clause, much like APPLY does in SQL Server.
Here is how you can effectively port SQL Server's APPLY operations to PostgreSQL.
--Porting CROSS APPLY to PostgreSQL using LATERAL JOIN
SELECT A.ID, A.VAL, A.STATUS, V.ID, V.VAL, V.STATUS
FROM M A
JOIN LATERAL (
SELECT ID, VAL, STATUS
FROM S B
WHERE B.ID = A.ID
) AS V
ON TRUE
WHERE A.STATUS = 'DEL';
--Porting OUTER APPLY to PostgreSQL using LEFT JOIN LATERAL
SELECT A.ID, A.VAL, A.STATUS, V.ID, V.VAL, V.STATUS
FROM M A
LEFT JOIN LATERAL (
SELECT ID, VAL, STATUS
FROM S B
WHERE B.ID = A.ID
) AS V
ON TRUE
WHERE A.STATUS = 'DEL';
Both ported queries use a lateral subquery, which can refer to columns of the preceding table M in the FROM clause. This is critical for parameterized queries dependent on each row of the driving table. The ON TRUE condition is used to mimic the behavior of APPLY in SQL Server, ensuring that the JOIN operates like a Cartesian product filtered by the subquery's Where clause, similar to how APPLY functions.
Conclusion
We can leverage LATERAL joins to port SQL Server's CROSS APPLY and OUTER APPLY statements effectively. This technique ensures a smooth transition and functional equivalence in PostgreSQL for complex queries involving dependent subquries.
Added on May 15, 2024
In SQL Server, when you use the 'NVARCHAR(n)' data type, the database stores up to 'n' characters in the column. For Korean characters, SQL Server uses 2 bytes per character in an 'NVARCHAR' column. This is more storage-efficient compared to PostgreSQL, where each Korean character typically requires 3 bytes when using the 'VARCHAR' type with UTF-8 encoding.