SQLMDX

Eventually it's all about data

Query transformations in Oracle: view merging

Posted by sqlmdx on November 14, 2013

Query transformations in Oracle is a part of query optimization.
In short, query optimization can be divided into three parts
1. Logical optimization: query transformation
2. Physical optimization: searching for optimal access paths, join methods, join orders.
3. Cost Estimator: choosing the optimal plan.

Additional details can be found in a presentation Query Transformations by Joze Senegacnik.

In terms of Oracle SQL Engine, logical optimization is performed by Query Transformer block while physical optimization is done by Plan Generator.
Plan costing, comparison and choosing the one with the lowest cost is done by the Cost Estimator.
Additional details regarding Oracle SQL Engine can be found in a VLDB paper Closing the Query Processing Loop in Oracle 11g.

In older Oracle versions there were only so called heuristic-based transformations. They were applied without taking cost into account and with an assumption that they produce faster plan most of time. Since version 10gR1 Oracle introduced cost-based transformations which are applied only in case when transformed query has the lower cost than non-transformed. So in this case Plan Generator loops back to Query Transformer which leads to many more strategies evaluated to find the most efficient plan.

Let’s start consideration of transformations with analysis of view merging.
Documentation contains brief explanation of view merging

In view merging, the optimizer merges the query block representing a view into the query block that contains it. View merging can improve plans by enabling the optimizer to consider additional join orders, access methods, and other transformations.

For example, after a view has been merged and several tables reside in one query block, a table inside a view may permit the optimizer to use join elimination to remove a table outside the view. For certain simple views in which merging always leads to a better plan, the optimizer automatically merges the view without considering cost. Otherwise, the optimizer uses cost to make the determination. The optimizer may choose not to merge a view for many reasons, including cost or validity restrictions.

Let’s create sample tables and test following query with enabled and disabled view merging.

create table t as select 0 id from dual;

create table t1 as
select rownum id, rownum value from dual connect by level <= 10;

create table t2 as
select rownum id, lpad(' ', 4000, ' ') padding
  from dual
connect by level <= 100000;

begin
  dbms_stats.gather_table_stats(user,'t');
  dbms_stats.gather_table_stats(user,'t1');
  dbms_stats.gather_table_stats(user,'t2');
end;
/

It’s worth to mention that below query is a subject to simple view merging even though documentation states that “a view may not be valid for simple view merging because of subqueries in the select list”.
As you can see from below output, view merging was bypassed with no_merge hint but once hint is removed total cost jumped from 27260 to 27275 as well as execution time significantly increased.

SQL> select *
  2    from (select --+ no_merge
  3           t1.*, (select count(*) from t where t.id = t1.id) cnt
  4            from t1) t1
  5    join t2
  6      on t1.id = t2.id
  7   where t1.cnt <> 0;

no rows selected

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST COST'));
SQL_ID  6fb0x9rtrmxrv, child number 0
-------------------------------------
select *   from (select --+ no_merge          t1.*, (select count(*)
from t where t.id = t1.id) cnt           from t1) t1   join t2     on
t1.id = t2.id  where t1.cnt <> 0

Plan hash value: 2082057663

---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        | 27260 (100)|      0 |00:00:00.03 |      33 |   4 |          |       |          |
|   1 |  SORT AGGREGATE     |      |     10 |      1 |            |     10 |00:00:00.01 |      30 |   2 |          |       |          |
|*  2 |   TABLE ACCESS FULL | T    |     10 |      1 |     3   (0)|      0 |00:00:00.01 |      30 |   2 |          |       |          |
|*  3 |  HASH JOIN          |      |      1 |     10 | 27260   (1)|      0 |00:00:00.03 |      33 |   4 |   755K|   755K|  183K (0)|
|*  4 |   VIEW              |      |      1 |     10 |     3   (0)|      0 |00:00:00.03 |      33 |   4 |          |       |          |
|   5 |    TABLE ACCESS FULL| T1   |      1 |     10 |     3   (0)|     10 |00:00:00.02 |       3 |   2 |          |       |          |
|   6 |   TABLE ACCESS FULL | T2   |      0 |    100K| 27256   (1)|      0 |00:00:00.01 |       0 |   0 |          |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T"."ID"=:B1)
   3 - access("T1"."ID"="T2"."ID")
   4 - filter("T1"."CNT"<>0)

27 rows selected.

SQL> select *
  2    from (select --
  3           t1.*, (select count(*) from t where t.id = t1.id) cnt
  4            from t1) t1
  5    join t2
  6      on t1.id = t2.id
  7   where t1.cnt <> 0;

no rows selected

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST COST'));
SQL_ID  6nfdtzs74dtya, child number 0
-------------------------------------
select *   from (select --          t1.*, (select count(*) from t where
t.id = t1.id) cnt           from t1) t1   join t2     on t1.id = t2.id
where t1.cnt <> 0

Plan hash value: 1807903439

---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        | 27275 (100)|      0 |00:00:11.46 |     100K|    100K|       |       |          |
|   1 |  SORT AGGREGATE     |      |     10 |      1 |            |     10 |00:00:00.01 |      30 |   0 |          |       |          |
|*  2 |   TABLE ACCESS FULL | T    |     10 |      1 |     3   (0)|      0 |00:00:00.01 |      30 |   0 |          |       |          |
|*  3 |  FILTER             |      |      1 |        |            |      0 |00:00:11.46 |     100K|    100K|       |       |          |
|*  4 |   HASH JOIN         |      |      1 |     10 | 27260   (1)|     10 |00:00:11.46 |     100K|    100K|   951K|   951K| 1094K (0)|
|   5 |    TABLE ACCESS FULL| T1   |      1 |     10 |     3   (0)|     10 |00:00:00.01 |       3 |   0 |          |       |          |
|   6 |    TABLE ACCESS FULL| T2   |      1 |    100K| 27256   (1)|    100K|00:00:11.40 |     100K|    100K|       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T"."ID"=:B1)
   3 - filter(<>0)
   4 - access("T1"."ID"="T2"."ID")

27 rows selected.

10053 trace shows that in first case simple view merging (SVM) was bypassed

*************************
CNT:     COUNT() to COUNT(*) not done.
SVM:     SVM bypassed: Query NO MERGE hint.
CVM:   Merging SPJ view SEL$1 (#0) into SEL$4 (#0)
Registered qb: SEL$16C51A37 0x185e362c (VIEW MERGE SEL$4; SEL$1)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$16C51A37 nbfros=2 flg=0
    fro(0): flg=1 objn=0 hint_alias="T1"@"SEL$1"
    fro(1): flg=0 objn=90868 hint_alias="T2"@"SEL$1"

JE:   Considering Join Elimination on query block SEL$16C51A37 (#0)
*************************

while in second case it was successfully applied

*************************
CNT:     COUNT() to COUNT(*) not done.
CVM:   Merging SPJ view SEL$2 (#0) into SEL$1 (#0)
Registered qb: SEL$F5BB74E1 0x19445400 (VIEW MERGE SEL$1; SEL$2)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$F5BB74E1 nbfros=2 flg=0
    fro(0): flg=0 objn=90868 hint_alias="T2"@"SEL$1"
    fro(1): flg=0 objn=90867 hint_alias="T1"@"SEL$2"

CVM:   Merging SPJ view SEL$F5BB74E1 (#0) into SEL$4 (#0)
Registered qb: SEL$55EFA6E7 0x19447e40 (VIEW MERGE SEL$4; SEL$F5BB74E1)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$55EFA6E7 nbfros=2 flg=0
    fro(0): flg=0 objn=90868 hint_alias="T2"@"SEL$1"
    fro(1): flg=0 objn=90867 hint_alias="T1"@"SEL$2"

JE:   Considering Join Elimination on query block SEL$55EFA6E7 (#0)
*************************

It’s not obvious that in second case simple view merging takes an effect but not complex view merging (CVM) given the output of 10053.
But it could be proven by playing with hidden parameters “_simple_view_merging” and “_complex_view_merging”.
The latter doesn’t make any impact on second query while former disables the view merging and makes it as fast as the first one.

It’s known that simple view merging is heuristic-based transformation but it’s a delicate question whether complex view merging is cost-based transformation or not.
As documentation claims, complex view merging was introduced in 8.0.4 which is considerably earlier than cost-based transformations in 10gR1.
That means that complex view merging initially was heuristic-based and Oracle has been applying it whenever possible but since 10gR1 it’s applied only if it leads to reduced cost (but as will be described in post dedicated to or-expansion, in some cases even cost-based transformations may increase cost).
Final versions for both queries extracted from 10053 are following:

select "T1"."ID"      "ID",
       "T1"."VALUE"   "VALUE",
       "T1"."CNT"     "CNT",
       "T2"."ID"      "ID",
       "T2"."PADDING" "PADDING"
  from (select /*+ NO_MERGE */
         "SYS_ALIAS_1"."ID" "ID",
         "SYS_ALIAS_1"."VALUE" "VALUE",
         (select count(*) "COUNT(*)"
            from "SCOTT"."T" "T"
           where "T"."ID" = "SYS_ALIAS_1"."ID") "CNT"
          from "SCOTT"."T1" "SYS_ALIAS_1") "T1",
       "SCOTT"."T2" "T2"
 where "T1"."CNT" <> 0
   and "T1"."ID" = "T2"."ID"

and

select "SYS_ALIAS_1"."ID" "ID",
       "SYS_ALIAS_1"."VALUE" "VALUE",
       (select count(*) "COUNT(*)"
          from "SCOTT"."T" "T"
         where "T"."ID" = "SYS_ALIAS_1"."ID") "CNT",
       "T2"."ID" "ID",
       "T2"."PADDING" "PADDING"
  from "SCOTT"."T1" "SYS_ALIAS_1", "SCOTT"."T2" "T2"
 where (select count(*) "COUNT(*)"
          from "SCOTT"."T" "T"
         where "T"."ID" = "SYS_ALIAS_1"."ID") <> 0
   and "SYS_ALIAS_1"."ID" = "T2"."ID"

Let’s change ANSI syntax into native syntax and check the results.

SQL> select *
  2    from (select
  3           t1.*, (select count(*) from t where t.id = t1.id) cnt
  4            from t1) t1,
  5         t2
  6   where t1.id = t2.id
  7     and t1.cnt <> 0;

no rows selected

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST COST'));
SQL_ID  ax9k5b79rrv1n, child number 0
-------------------------------------
select *   from (select          t1.*, (select count(*) from t where
t.id = t1.id) cnt           from t1) t1,        t2  where t1.id = t2.id
   and t1.cnt <> 0

Plan hash value: 2082057663

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        | 27260 (100)|      0 |00:00:00.01 |      33 |    |          |          |
|   1 |  SORT AGGREGATE     |      |     10 |      1 |            |     10 |00:00:00.01 |      30 |    |          |          |
|*  2 |   TABLE ACCESS FULL | T    |     10 |      1 |     3   (0)|      0 |00:00:00.01 |      30 |    |          |          |
|*  3 |  HASH JOIN          |      |      1 |     10 | 27260   (1)|      0 |00:00:00.01 |      33 |   755K|   755K|  159K (0)|
|*  4 |   VIEW              |      |      1 |     10 |     3   (0)|      0 |00:00:00.01 |      33 |    |          |          |
|   5 |    TABLE ACCESS FULL| T1   |      1 |     10 |     3   (0)|     10 |00:00:00.01 |       3 |    |          |          |
|   6 |   TABLE ACCESS FULL | T2   |      0 |    100K| 27256   (1)|      0 |00:00:00.01 |       0 |    |          |          |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T"."ID"=:B1)
   3 - access("T1"."ID"="T2"."ID")
   4 - filter("T1"."CNT"<>0)

27 rows selected.

So query plan is the same as one with no_merge hint and ANSI syntax.
10053 trace contains information that simple view merging was bypassed due to validity restrictions:

*************************
CNT:     COUNT() to COUNT(*) not done.
SVM:     SVM bypassed: invalidated.
JE:   Considering Join Elimination on query block SEL$1 (#0)
*************************

Final query looks as following:

select "T1"."ID"      "ID",
       "T1"."VALUE"   "VALUE",
       "T1"."CNT"     "CNT",
       "T2"."ID"      "ID",
       "T2"."PADDING" "PADDING"
  from (select "SYS_ALIAS_1"."ID" "ID",
               "SYS_ALIAS_1"."VALUE" "VALUE",
               (select count(*) "COUNT(*)"
                  from "MEGA"."T" "T"
                 where "T"."ID" = "SYS_ALIAS_1"."ID") "CNT"
          from "MEGA"."T1" "SYS_ALIAS_1") "T1",
       "MEGA"."T2" "T2"
 where "T1"."ID" = "T2"."ID"
   and "T1"."CNT" <> 0

So native join syntax leads to the same plan and the same final query as ANSI syntax with no_merge hint.
The 10053 shows that simple (as well as complex) view merging was bypassed in first and third cases but second case simple view merging took place which was followed with significantly degraded query performance.

Actually the root cause for performance degradation for second query is not a view merging but the fact that filter was applied after join. This behavior can be changed using push_subq hint.

SQL> select *
  2    from (select /*+ push_subq(@subq) */
  3           t1.*,
  4           (select /*+ qb_name(subq) */
  5             count(*)
  6              from t
  7             where t.id = t1.id) cnt
  8            from t1) t1
  9    join t2
 10      on t1.id = t2.id
 11   where t1.cnt <> 0;

no rows selected

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST COST'));
SQL_ID  gn7npxhtyqh21, child number 0
-------------------------------------
select *   from (select /*+ push_subq(@subq) */          t1.*,
(select /*+ qb_name(subq) */            count(*)             from t
       where t.id = t1.id) cnt           from t1) t1   join t2     on
t1.id = t2.id  where t1.cnt <> 0

Plan hash value: 882604598

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        | 27263 (100)|      0 |00:00:00.01 |      33 |    |  |          |
|   1 |  SORT AGGREGATE    |      |     10 |      1 |            |     10 |00:00:00.01 |      30 |    |  |          |
|*  2 |   TABLE ACCESS FULL| T    |     10 |      1 |     3   (0)|      0 |00:00:00.01 |      30 |    |  |          |
|*  3 |  HASH JOIN         |      |      1 |      1 | 27260   (1)|      0 |00:00:00.01 |      33 |   878K|   878K|  168K (0)|
|*  4 |   TABLE ACCESS FULL| T1   |      1 |      1 |     3   (0)|      0 |00:00:00.01 |      33 |    |  |          |
|   5 |   TABLE ACCESS FULL| T2   |      0 |    100K| 27256   (1)|      0 |00:00:00.01 |       0 |    |  |          |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T"."ID"=:B1)
   3 - access("T1"."ID"="T2"."ID")
   4 - filter(<>0)

27 rows selected.

Final query in this case looks as following:

select "SYS_ALIAS_1"."ID" "ID",
       "SYS_ALIAS_1"."VALUE" "VALUE",
       (select /*+ PUSH_SUBQ QB_NAME ("SUBQ") */
         count(*) "COUNT(*)"
          from "SCOTT"."T" "T"
         where "T"."ID" = "SYS_ALIAS_1"."ID") "CNT",
       "T2"."ID" "ID",
       "T2"."PADDING" "PADDING"
  from "SCOTT"."T1" "SYS_ALIAS_1", "SCOTT"."T2" "T2"
 where (select /*+ PUSH_SUBQ QB_NAME ("SUBQ") */
         count(*) "COUNT(*)"
          from "SCOTT"."T" "T"
         where "T"."ID" = "SYS_ALIAS_1"."ID") <> 0
   and "SYS_ALIAS_1"."ID" = "T2"."ID"

So the bottom line of this post is that simple view merging is not always beneficial.
If a filter on view is based on scalar subquery then it will be applied after join in case of view merging which may result in worse performance.

11.2.0.1 was used for demonstration.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: