In this I’ll do a quick performance comparison of built-in Oracle Text index versus the one created using Extensible indexing interface (described in this post).
First of all, Oracle Text option must be installed to use built-in text indexes. This functionality is getting installed under CTXSYS schema.
Speaking about DIY text indexes, it’s mandatory to have privileges to create index types and operators or simply RESOURCE role.
Once all permissions are granted, below statement can be used to simulate functionality implemented in tp_instr using built-in capabilities.
begin
ctx_ddl.drop_preference('mywordlist');
ctx_ddl.create_preference('mywordlist', 'BASIC_WORDLIST');
ctx_ddl.set_attribute('mywordlist','SUBSTRING_INDEX', 'YES');
end;
/
create index idx_t_str_ctx on t(str) indextype is ctxsys.context parameters ('Wordlist mywordlist');
Similar to the approach used in tp_instr Oracle creates “ladder” table.
SQL> select * from dr$idx_t_str_ctx$p
2 where pat_part1 || pat_part2 = upper('DRNXIrhioCpVQZtOrACL')
3 order by length(pat_part2);
PAT_PART1 PAT_PART2
------------------------- -------------------------
DRNXIRHIOCPVQZTOR ACL
DRNXIRHIOCPVQZTO RACL
DRNXIRHIOCPVQZT ORACL
DRNXIRHIOCPVQZ TORACL
DRNXIRHIOCPVQ ZTORACL
DRNXIRHIOCPV QZTORACL
DRNXIRHIOCP VQZTORACL
DRNXIRHIOC PVQZTORACL
DRNXIRHIO CPVQZTORACL
DRNXIRHI OCPVQZTORACL
DRNXIRH IOCPVQZTORACL
DRNXIR HIOCPVQZTORACL
DRNXI RHIOCPVQZTORACL
DRNX IRHIOCPVQZTORACL
DRN XIRHIOCPVQZTORACL
DR NXIRHIOCPVQZTORACL
D RNXIRHIOCPVQZTORACL
17 rows selected.
Performance test brings below timings
set timing on
set serveroutput on
declare
x int;
ts timestamp;
n int := 1000;
begin
ts := systimestamp;
for i in 1 .. n / 100
loop
select count (*)
into x
from t
where upper (str) like '%ORAC%';
end loop;
dbms_output.put_line (systimestamp - ts);
dbms_output.put_line (x);
ts := systimestamp;
for i in 1 .. n
loop
select count (*)
into x
from t
where contains (str, '%ORAC%') > 0;
end loop;
dbms_output.put_line (systimestamp - ts);
dbms_output.put_line (x);
ts := systimestamp;
for i in 1 .. n
loop
select count (*)
into x
from t
where op_instr (str, lower ('ORAC')) = 1;
end loop;
dbms_output.put_line (systimestamp - ts);
dbms_output.put_line (x);
ts := systimestamp;
for i in 1 .. n
loop
select count (*)
into x
from idx_t_str$
where sub_str like lower ('ORAC%');
end loop;
dbms_output.put_line (systimestamp - ts);
dbms_output.put_line (x);
end;
/
+000000000 00:00:02.739000000 38 +000000000 00:00:02.170000000 38 +000000000 00:00:00.380000000 38 +000000000 00:00:00.022000000 38
So built-in index is 100 times faster then approach with like (similar timings but 100 times more executions). DIY text index is about 5 times faster then built-in one.
And finally, the statement which queries data from auxiliary table is approximately 20 times faster then DIY index.
This clearly shows that irrespective whatever approach you choose for domain indexing it will introduce noticeable overhead. Further details can be found using dbms_hprof and SQL tracing.