설문조사
PostgreSQL/PPAS 관련 듣고 싶은 교육은


총 게시물 162건, 최근 0 건
   

Database Design and Performance

글쓴이 : 모델광 날짜 : 2021-10-29 (금) 05:48 조회 : 1560

When I am involved in a project as an SQL tuner, from time to time, there are queries that I cannot optimize. Below are several factors which prevent sql tuners or developers from writing performant queries.

* The query tuner or developer doesn't understand the business intent.

* The database design is not right for the usage of the data.


It is very tough to evaluate someone else's database design. If you do not have any exposure to the business specifics and the purpose of the new tables and views, there is not much you can do to determine whether the proposed design is optimal. The only thing you may be able to evaluate without knowing the business requirements is whether the database design is normalized. However, the fact that the table is normalized doesn't guarantee that the queries on the table can be performant.


Thus, query optimization starts with gathering requirements. To illustrate the argument, let's look at the following example: in this database, we need to store dealer ids, and we need to store the date on which each dealer wrote their signature. Two possible designs are shown below.


* Denormalized table design

CREATE TABLE dealer_m (

dealer_id                  int not null,

dealer_name            varchar(50) not null,

dealer_jumin_no       varchar(13) not null,

phone_no                 varchar(13),

email                       varchar(100),

declaration_sign_date date not null,

release_sign_date      date,

update_sign_date      date,

change_sign_date      date);

ALTER TABLE dealer_m ADD CONSTRAINT dealer_id_pk PRIMARY KEY (dealer_id);


* Normalized table design

CREATE TABLE dealer_m (

dealer_id               int not null,

dealer_name         varchar(50) not null,

dealer_jumin_no    varchar(13) not null,

phone_no              varchar(13),

email                    varchar(100));

ALTER TABLE dealer_m ADD CONSTRAINT dealer_id_pk PRIMARY KEY (dealer_id);


CREATE TABLE dealer_sign_m (

dealer_id              int not null,

sign_type_code     varchar(1) not null

sign_date              date          );

ALTER TABLE dealer_sign_m ADD CONSTRAINT dealer_sign_m_pk PRIMARY KEY (dealer_id, sign_type_code);

ALTER TABLE dealer_sign_m ADD CONSTRAINT dealer_sign_m_fk FOREIGN KEY (dealer_id) REFERENCES dealer_m(dealer_id);


Which of the two designs is the right one? It depends on the intended usage of the data. If the dates on which the dealer writes their signature are never used as search criteria and are selected as a part of a dealer_id, if UX has fields labeled with specific signature types, then a denormalized design is more appropriate.

However, if we want to search by the signature date regardless of type, having all the signature dates in a separate table will make the search more performant.

Also, there are chances that business requirements can be changed. Taking this into account the latter design is more desirable.

Lastly, we can evaluate the frequency of each use case and how critical response time is in each case.


Conclusion

Ideally, optimization begins at the time of gathering business requirements and designing the database. We should be aware of the database design and its implications on the query performance.


Addendum

If you think database design is some ancient thing that shouldn't be done anymore, you might be interested in Doctor Stonebraker's blog post here:


https://www.enterprisedb.com/blog/schema-later-considered-harmful


Added on 5, March 2022

Now I am working as a data modeler on a next generation project in Public Procurement Service. When I try to normalize entities in the To-Be system, application program designers are reluctant to changing the data model. Application designers and I are almost fighting.

Application designers argue, "if we modify the data model, we have to change application source codes a lot. On top of that, there is a strong chance that we may not meet the project deadline. Why do we have to join many tables to extract the result? In an AS-IS database we could get the requested data without joins."

One of the ways to have a successful project is to delight the end user. Here, the end user is the management in Public Procurement Service. If the project team doesn't meet the deadline, the end user will not be happy. So from the viewpoint of application designers and developers who focus on efficient development, modifying the data model is a risk.

Whenever I work as a data modeler, I have conflicts of two passions in me. One passion is based on what is good for the management (or the customer) in the long run. The other passion is based on what is good for the project team. For the project team avoiding risks is a top priority. For the customer making the data model robust is a top priority.

I strongly belive that normalized database design enhances flexibility in a system which enables us to cope with business requirement changes with ease. Joins are not barriers to performance. A robust relational data model is an excellent communication tool, removes duplicates and improves integrity, which improves system performance.

To a data modeler, normalizing entities does not take a lot of energy. So I am trying to persuade application designers to agree on the normalized database design.


   

postgresdba.com