dbms_comparison was introduced in 11g as a powerful tool to compare and synchronize table data in different tables.
Even though it’s quite flexible it has some limitations:
1. The tables are supposed to be in different databases. Therefore if you want to compare tables in the same database you will need to create fake db link.
2. The tables must have unique indexes.
Another consequence from the flexibility is that synchronizing is being applied in two steps (merge & delete) even though it could have been done using single merge statement.
Moreover those statements are based not only on tables that are synchronized but also on auxiliary data populated after comparison.
For the below two tables
create table t1(id, name, constraint pk_t1 primary key (id)) as select rownum + 1, 'name src' from dual connect by rownum <= 3; create table t2(id, name, constraint pk_t2 primary key (id)) as select rownum + 2, 'name dst' from dual connect by rownum <= 3;
dbms_comparison will generate two statements for synchronization:
merge into "TEST"."T2"@self t using (select /*+ USE_NL(d s) */ s."ID" "ID", s."NAME" "NAME", d.rmt_rowid d_tar_rowid from "TEST"."T1" s, "_USER_COMPARISON_ROW_DIF" d where d.comparison_id = :cmp_id and d.scan_id = :scan_id and d.status = 2 and d.loc_rowid = s.rowid and d.idx_val = s."ID") ss on (t.rowid = ss.d_tar_rowid and t."ID" = ss."ID") when matched then update set t."NAME" = ss."NAME" when not matched then insert ("ID", "NAME") values (ss."ID", ss."NAME")
and
delete "TEST"."T2"@self where (rowid, 'X' || "ID") in (select d.rmt_rowid, 'X' || d.idx_val from "_USER_COMPARISON_ROW_DIF" d where d.comparison_id = :cmp_id and d.scan_id = :scan_id and d.status = 2 and d.loc_rowid is null)
However it’s possible to merge them like following:
merge into t2 using (select nvl(t2.id, t1.id) id, t1.name, nvl2(t1.id, 1, null) sign from t2 full join t1 on t1.id = t2.id) t on (t2.id = t.id) when matched then update set t2.name = t.name where nvl(t2.name, chr(1)) <> nvl(t.name, chr(1)) delete where t.sign is null when not matched then insert values (t.id, t.name)
Last statement could be generated dynamically based on tables metadata so such approach remains more preferable for me because allows full control of performance.