The purpose of collecting optimizer statistics is to maintain
up-to-date statistics that the Oracle Rdb optimizer uses
to determine solution costs and cardinalities during query
optimization.
You can collect cardinality and storage statistics by issuing the
RMU Collect Optimizer_Statistics command. You can direct Oracle
RMU to collect these statistics for particular tables or indexes
by using the Tables, System_Relations, or Indexes qualifiers.
Before you can collect workload statistics, you must first
generate a workload profile with SQL. The following list
describes the general procedure for generating a workload profile
and collecting workload statistics:
1. Enable workload profiling with the WORKLOAD COLLECTION
IS ENABLED clause of the SQL ALTER DATABASE or SQL CREATE
DATABASE statement.
SQL creates a new system table called RDB$WORKLOAD.
2. Execute the queries for which you want the Oracle Rdb
optimizer to have the best possible statistics.
When you execute the queries, the optimizer determines which
groups of columns are important for optimal processing of the
query. These groups of columns are referred to as workload
column groups. Note that a workload column group may actually
contain only one column.
Each set of workload column groups is entered as a row in the
RDB$WORKLOAD system table.
At this point, the only data in the RDB$WORKLOAD system table
are the workload column groups, the tables with which the
column group is associated, and the date they were entered
into the table. No statistics are currently recorded in the
RDB$WORKLOAD system table.
3. In most cases, now you disable workload profiling with the SQL
ALTER DATABASE WORKLOAD COLLECTION IS DISABLED clause.
Queries executed after you disable workload profiling are
not scanned by the Oracle Rdb optimizer for workload column
groups. You can leave the workload profiling enabled if the
same queries are always executed. In such a case, no new rows
are entered into the RDB$WORKLOAD system table. However, if
you anticipate that queries will be executed for which you do
not want workload profiling to be enabled, you need to disable
workload collection.
4. Execute an RMU Collect Optimizer_Statistics command with the
Statistics=(Workload) qualifier.
Oracle RMU reads the RDB$WORKLOAD system table to determine
for which column groups it should collect statistics, and then
collects the statistics.
5. Execute the queries previously profiled again.
The optimizer uses the statistics gathered by Oracle RMU to
make a best effort at optimizing the profiled queries.
The following list provides some guidelines on when to issue the
RMU Collect Optimizer_Statistics command and which Statistics
qualifier options you should use:
o You should enable workload profiling and execute the
RMU Collect Optimizer_Statistics command with the
Statistics=(Workload) qualifier when you introduce new,
complex, frequently used queries as part of your regular work.
o You should execute the RMU Collect Optimizer_Statistics
command with the Statistics=(Storage) qualifier after you
add metadata, such as new tables or indexes, to the database.
In this case, you do not need to reenable workload profiling.
o You should execute the RMU Collect Optimizer_Statistics
command with the Statistics=(Storage, Workload) qualifier
when the data in the database has significantly increased,
decreased, or changed. In this case, you do not need to
reenable workload profiling.
The statistics you can gather with the RMU Collect Optimizer_
Statistics command and a description of how the optimizer uses
these statistics are summarized in Statistics Gathered by the RMU
Collect Optimizer_Statistics Command.
Table 6 Statistics Gathered by the RMU Collect Optimizer_
Statistics Command
Cardinality Statistics
Statistic
Gathered: Definition: Used by Optimizer to:
Table Number of rows in Determine solution cardinality.
Cardinality a table.
Index Number of Estimate the number of index
Cardinality distinct key keys returned.
values in an
index.
Index Prefix Number of Estimate the number of index
Cardinality distinct key keys returned based on a sorted
values in leading index range.
parts of a multi-
segmented B-tree
index.
Workload Statistics
Statistic
Gathered: Definition: Used by Optimizer to:
Column Group Average number Determine strategies for
Duplicity of duplicates equiselections (selections
Factor per distinct with the IS NULL predicate
value in a column or selection predicates with
group. This is an the equals (=) operator),
estimated value. equijoins, grouped aggregation
(for example, the SQL GROUP
BY clause), or projection
operations (for example, the
SQL DISTINCT clause).
Column Group Number of table Estimate the effects of
Null Factor rows with a NULL NULL data on equijoins and
value in at least equiselections (because they
one column of a imply the removal of rows with
column group. NULL values). Also used for
This is an estimating the cardinality of
estimated value. an outer join result.
Storage Statistics
Statistic
Gathered: Definition: Used by Optimizer to:
Average Average number Estimate the cost of descending
Index Depth of levels to the B-tree. (A cross join with
(sorted traverse on a an inner table that is accessed
indexes B-tree descent. by a sorted index involves
only) repetitive B-tree descents.)
Index Key Average number of Improve the cost estimate
Clustering I/Os required of performing an index-only
Factor to read one retrieval for hashed and sorted
index key and indexes.
all associated
dbkeys during
a hashed key
lookup or a B-
tree index scan,
excluding the
B-tree descent.
Index Data Average number Estimate the cost for fetching
Clustering of I/Os required data rows from a sorted index
Factor to fetch data scan or from a hash bucket.
rows using dbkeys
associated with
an index key.
Table Row The average Estimate the cost of performing
Clustering number of I/Os a sequential scan of a table.
Factor required to read
one row during a
sequential scan
of a table.