dbms_comparison was introduced in 11g as a powerful tool to compare and synchronize table data in different tables.
Even though it’s quite flexible it has some limitations:
1. The tables are supposed to be in different databases. Therefore if you want to compare tables in the same database you will need to create fake db link.
2. The tables must have unique indexes.
Another consequence from the flexibility is that synchronizing is being applied in two steps (merge & delete) even though it could have been done using single merge statement.
Moreover those statements are based not only on tables that are synchronized but also on auxiliary data populated after comparison.
For the below two tables
create table t1(id, name, constraint pk_t1 primary key (id)) as select rownum + 1, 'name src' from dual connect by rownum <= 3; create table t2(id, name, constraint pk_t2 primary key (id)) as select rownum + 2, 'name dst' from dual connect by rownum <= 3;
dbms_comparison will generate two statements for synchronization:
merge into "TEST"."T2"@self t
using (select /*+ USE_NL(d s) */
s."ID" "ID", s."NAME" "NAME", d.rmt_rowid d_tar_rowid
from "TEST"."T1" s, "_USER_COMPARISON_ROW_DIF" d
where d.comparison_id = :cmp_id
and d.scan_id = :scan_id
and d.status = 2
and d.loc_rowid = s.rowid
and d.idx_val = s."ID") ss
on (t.rowid = ss.d_tar_rowid and t."ID" = ss."ID")
when matched then
update set t."NAME" = ss."NAME"
when not matched then
insert ("ID", "NAME") values (ss."ID", ss."NAME")
and
delete "TEST"."T2"@self
where (rowid, 'X' || "ID") in
(select d.rmt_rowid, 'X' || d.idx_val
from "_USER_COMPARISON_ROW_DIF" d
where d.comparison_id = :cmp_id
and d.scan_id = :scan_id
and d.status = 2
and d.loc_rowid is null)
However it’s possible to merge them like following:
merge into t2
using (select nvl(t2.id, t1.id) id, t1.name, nvl2(t1.id, 1, null) sign
from t2
full join t1
on t1.id = t2.id) t
on (t2.id = t.id)
when matched then
update
set t2.name = t.name
where nvl(t2.name, chr(1)) <> nvl(t.name, chr(1))
delete
where t.sign is null
when not matched then
insert
values
(t.id, t.name)
Last statement could be generated dynamically based on tables metadata so such approach remains more preferable for me because allows full control of performance.