The purpose of collecting optimizer statistics is to maintain up-to-date statistics that the Oracle Rdb optimizer uses to determine solution costs and cardinalities during query optimization. You can collect cardinality and storage statistics by issuing the RMU Collect Optimizer_Statistics command. You can direct Oracle RMU to collect these statistics for particular tables or indexes by using the Tables, System_Relations, or Indexes qualifiers. Before you can collect workload statistics, you must first generate a workload profile with SQL. The following list describes the general procedure for generating a workload profile and collecting workload statistics: 1. Enable workload profiling with the WORKLOAD COLLECTION IS ENABLED clause of the SQL ALTER DATABASE or SQL CREATE DATABASE statement. SQL creates a new system table called RDB$WORKLOAD. 2. Execute the queries for which you want the Oracle Rdb optimizer to have the best possible statistics. When you execute the queries, the optimizer determines which groups of columns are important for optimal processing of the query. These groups of columns are referred to as workload column groups. Note that a workload column group may actually contain only one column. Each set of workload column groups is entered as a row in the RDB$WORKLOAD system table. At this point, the only data in the RDB$WORKLOAD system table are the workload column groups, the tables with which the column group is associated, and the date they were entered into the table. No statistics are currently recorded in the RDB$WORKLOAD system table. 3. In most cases, now you disable workload profiling with the SQL ALTER DATABASE WORKLOAD COLLECTION IS DISABLED clause. Queries executed after you disable workload profiling are not scanned by the Oracle Rdb optimizer for workload column groups. You can leave the workload profiling enabled if the same queries are always executed. In such a case, no new rows are entered into the RDB$WORKLOAD system table. However, if you anticipate that queries will be executed for which you do not want workload profiling to be enabled, you need to disable workload collection. 4. Execute an RMU Collect Optimizer_Statistics command with the Statistics=(Workload) qualifier. Oracle RMU reads the RDB$WORKLOAD system table to determine for which column groups it should collect statistics, and then collects the statistics. 5. Execute the queries previously profiled again. The optimizer uses the statistics gathered by Oracle RMU to make a best effort at optimizing the profiled queries. The following list provides some guidelines on when to issue the RMU Collect Optimizer_Statistics command and which Statistics qualifier options you should use: o You should enable workload profiling and execute the RMU Collect Optimizer_Statistics command with the Statistics=(Workload) qualifier when you introduce new, complex, frequently used queries as part of your regular work. o You should execute the RMU Collect Optimizer_Statistics command with the Statistics=(Storage) qualifier after you add metadata, such as new tables or indexes, to the database. In this case, you do not need to reenable workload profiling. o You should execute the RMU Collect Optimizer_Statistics command with the Statistics=(Storage, Workload) qualifier when the data in the database has significantly increased, decreased, or changed. In this case, you do not need to reenable workload profiling. The statistics you can gather with the RMU Collect Optimizer_ Statistics command and a description of how the optimizer uses these statistics are summarized in Statistics Gathered by the RMU Collect Optimizer_Statistics Command. Table 6 Statistics Gathered by the RMU Collect Optimizer_ Statistics Command Cardinality Statistics Statistic Gathered: Definition: Used by Optimizer to: Table Number of rows in Determine solution cardinality. Cardinality a table. Index Number of Estimate the number of index Cardinality distinct key keys returned. values in an index. Index Prefix Number of Estimate the number of index Cardinality distinct key keys returned based on a sorted values in leading index range. parts of a multi- segmented B-tree index. Workload Statistics Statistic Gathered: Definition: Used by Optimizer to: Column Group Average number Determine strategies for Duplicity of duplicates equiselections (selections Factor per distinct with the IS NULL predicate value in a column or selection predicates with group. This is an the equals (=) operator), estimated value. equijoins, grouped aggregation (for example, the SQL GROUP BY clause), or projection operations (for example, the SQL DISTINCT clause). Column Group Number of table Estimate the effects of Null Factor rows with a NULL NULL data on equijoins and value in at least equiselections (because they one column of a imply the removal of rows with column group. NULL values). Also used for This is an estimating the cardinality of estimated value. an outer join result. Storage Statistics Statistic Gathered: Definition: Used by Optimizer to: Average Average number Estimate the cost of descending Index Depth of levels to the B-tree. (A cross join with (sorted traverse on a an inner table that is accessed indexes B-tree descent. by a sorted index involves only) repetitive B-tree descents.) Index Key Average number of Improve the cost estimate Clustering I/Os required of performing an index-only Factor to read one retrieval for hashed and sorted index key and indexes. all associated dbkeys during a hashed key lookup or a B- tree index scan, excluding the B-tree descent. Index Data Average number Estimate the cost for fetching Clustering of I/Os required data rows from a sorted index Factor to fetch data scan or from a hash bucket. rows using dbkeys associated with an index key. Table Row The average Estimate the cost of performing Clustering number of I/Os a sequential scan of a table. Factor required to read one row during a sequential scan of a table.