Relationship between encoding, collate, and ctype

Author : PostgresDBA Date : 2018-10-29 (Mon) 14:54 Views : 8594

enterprisedb@[local]:5445:edb]

SQL> \l+

-----------+--------------+----------+------------+------------+

test03 | enterprisedb | UTF8 | C | C |

(7 rows)

enterprisedb@[local]:5445:edb]

SQL>
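The `\l+` listing above is truncated in this copy. As a rough equivalent, the same three settings can be read from the `pg_database` catalog. This is a sketch that is not part of the original post; it assumes the standard catalog columns `encoding`, `datcollate`, and `datctype`.

```sql
-- Show encoding, collation, and ctype for every database in the cluster.
select datname,
       pg_encoding_to_char(encoding) as encoding,
       datcollate,
       datctype
from pg_database
order by datname;
```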

--------------------------- test.sql --------------------------------------

drop table if exists korean;

create table korean ( indx int, word varchar(30));

insert into korean values ( 1, '사과');

insert into korean values ( 2, '배');

insert into korean values ( 3, '자두');

insert into korean values ( 4, '망고');

insert into korean values ( 5, '파인애플');

insert into korean values ( 6, '포도');

insert into korean values ( 7, '딸기');

insert into korean values ( 8,'강아지');

insert into korean values ( 8, '관리자');

select * from korean order by word;

-- Duplicate every row except indx 4 ('망고') 100 times, so that '망고' stays a rare value.
insert into korean select x.* from korean x, generate_series(1,100) y where indx not in (4);

vacuum analyze verbose korean;

select * from korean order by word;

select count(*) from korean;

select * from korean where word like '망%';

select * from korean where word like '%망%';

select * from korean where word like '%똠%';

create index indxx on korean(word);

explain (analyze, buffers) select * from korean;

explain (analyze, buffers) select * from korean where word like '망%'; -- !!! key case: does the prefix LIKE use the index?

explain (analyze, buffers) select * from korean where word='망고';

------------------------------------------------------------------------------------------------

I ran the SQL above against each of test01, test02, and test03.

* Korean sorting => test01 and test03 sort correctly / test02 does not sort Korean correctly

* Whether a LIKE search uses the index => only test03 does / test01 and test02 always do a full scan
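For the sorting difference in the first bullet, one way to see whether the database default collation is responsible is to override it for a single query. A minimal sketch, assuming the `korean` table from test.sql; the explicit `collate "C"` clause is added here for illustration and is not part of the original test.

```sql
-- Sort with the database's default collation.
select * from korean order by word;

-- Sort the same rows by byte order, regardless of the database default.
select * from korean order by word collate "C";
```

If the two queries return rows in a different order, the default collation is what changes the sort.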

In other words, for the query select * from korean where word like '망%'; test01 and test02 do a full scan. However, if the index is created as create index indxx2 on korean(word collate "C"); the index does get used.
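The workaround can be checked roughly as follows. This sketch assumes the `korean` table and data from test.sql. `indxx2` is the index name used in the post; `indxx3` and its `varchar_pattern_ops` operator class are an alternative the author does not mention, shown only for comparison.

```sql
-- Index that compares by byte order even when the database collation is not C.
create index indxx2 on korean (word collate "C");

-- Alternative (not from the original post): a pattern operator class,
-- which PostgreSQL provides for left-anchored LIKE under non-C collations.
create index indxx3 on korean (word varchar_pattern_ops);

analyze korean;

-- The plan output names the index the planner picked, if any.
explain (analyze, buffers) select * from korean where word like '망%';
```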

Try it yourself.

To summarize:

Creating the database like test03 (UTF8, C) gives correct Korean sorting, needs no explicit collate when creating indexes, and is the safest choice overall.
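A database with the same settings as test03 could be created along these lines. The name `test04` is hypothetical. `template0` is specified because PostgreSQL only accepts an encoding or locale that differs from the template database's when the template is `template0`.

```sql
-- Hypothetical database name; settings mirror test03 (UTF8, C, C).
create database test04
    template   template0
    encoding   'UTF8'
    lc_collate 'C'
    lc_ctype   'C';
```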

윤명식

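2018-10-29 (Mon) 16:03

If the existing database is test01 or test02, you can force index use for character data as shown below. If you are starting from scratch, though, creating the database with the C locale, as mentioned above, seems better in many respects.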

explain (analyze, buffers) select * from korean where word >= '망' and word < '맞';

QUERY PLAN
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using indxx on korean (cost=0.28..2.49 rows=1 width=12) (actual time=0.046..0.047 rows=1 loops=1)
Index Cond: (((word)::text >= '망'::text) AND ((word)::text < '맞'::text))
Buffers: shared hit=3
Planning Time: 0.188 ms
Execution Time: 0.066 ms
(5 rows)
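The range in this plan works because, in Unicode, '맞' is the code point immediately after '망', so under byte-order comparison every string that starts with '망' falls in the half-open range from '망' up to '맞'. A quick check, assuming a UTF8 database (where `ascii()` returns the Unicode code point of the first character) and the `korean` table from test.sql:

```sql
-- Code points of the two range bounds; in a UTF8 database they are adjacent.
select ascii('망') as lower_bound, ascii('맞') as upper_bound;

-- Derive the upper bound from the prefix instead of typing it by hand.
select chr(ascii('망') + 1) as next_char;

-- The same range written with explicit byte-order comparison.
select * from korean
where word collate "C" >= '망' and word collate "C" < '맞';
```

For a longer prefix the same idea applies to its last character only.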

모모와도도

2019-01-21 (Mon) 11:01

This information seems very important. Thank you.
