The UPDATE STATISTICS statement (update_statistics_statement) defines the storage requirements of tables and indexes as well as the value distribution of columns, and stores this information in the database catalog.
<update_statistics_statement> ::=
UPDATE STAT[ISTICS] COLUMN <table_name>.<column_name> [ESTIMATE
[<sample_definition>]]
| UPDATE STAT[ISTICS] COLUMN (<column_name>,...) FOR <table_name>
[ESTIMATE [<sample_definition>]]
| UPDATE STAT[ISTICS] COLUMN (*) FOR <table_name> [ESTIMATE
[<sample_definition>]]
| UPDATE STAT[ISTICS] <table_name> [ESTIMATE
[<sample_definition>]]
| UPDATE STAT[ISTICS] [<schema_name>.][<identifier>]* [ESTIMATE
[<sample_definition>]]
| UPDATE STAT[ISTICS] AS PER SYSTEM TABLE
When the update statistics statement is executed, information on the table, such as the number of rows, the number of pages used, the sizes of indexes, the value distribution within columns or indexes, and so on, is stored in the database catalog. These values are used by the SQL Optimizer to determine the best search strategy for executing complex SQL statements.
When value distributions are determined, a maximum of 1022 bytes of any column value is considered, regardless of how many bytes the column value can – according to its definition – in fact use. This means that the value distributions that are determined can differ from the actual value distributions if there are column values that are identical for the first 1022 bytes and are only different after that point.
The update statistics statement implicitly performs a COMMIT statement for each base table; i.e. the transaction within which the update statistics statement has been executed is closed.
The statistical values stored in the database catalog can be retrieved by selecting the system table OPTIMIZERINFORMATION. Each row of the table describes statistical values of indexes, columns or the size of a table.
If a table name is specified, the table must be a base table and must not be a temporary table. The current database user must have a privilege for the table.
If a column name is specified, this column must exist in specified table.
If * is specified, all columns in the table are assumed.
Specifying <identifier>* has the same effect as specifying the update statistics statement for all base tables for which the current database user has a privilege and whose table name begins with the identifier.
The database system administrator (SYSDBA user) can use UPDATE STATISTICS * to execute the update statistics statement for all base tables, even if the SYSDBA user has not been assigned a privilege for these tables.
● If ESTIMATE and a sample definition are specified, the database system estimates the statistical values by selecting data at random. The number of random selects can be given as number of rows or as percentage. The runtime of the update statistics statement can be considerably reduced by specifying ESTIMATE. In most cases, the precision of the statistical values determined is sufficient.
● If ESTIMATE is specified without a sample definition, the database system estimates the statistical values by selecting data at random. The number of random selects was defined with the CREATE TABLE statement or an ALTER TABLE statement by means of a sample_definition for the specified table. The runtime of the update statistics statement can be considerably reduced by specifying ESTIMATE. In most cases, the precision of the statistical valuesdetermined is sufficient.
● If ESTIMATE is not specified, the database system determines exact statistical values by considering the complete data of the table. For large tables, the runtime can be considerably long.
The database kernel enters those table columns in the system table SYSUPDSTATWANTED (located in the schema of the database administrator) for which an UPDATE STATISTICS run would appear to be useful.
If you use the SQL statement UPDATE STATISTICS AS PER SYSTEM TABLE, an UPDATE STATISTICS run is triggered for all the table (columns) contained in SYSUPDSTATWANTED. If the database parameter UPDATESTAT_PARALLEL_SERVERS is set to a value >= 0, the system attempts to execute the statistics runs in parallel. Subsequently, the entries in table SYSUPDSTATWANTED are deleted.
The system table SYSUPDSTATWANTED can also be changed by database administrators. However, a database administrator can only insert entries that reference tables owned by that database administrator.
You can exclude individual tables from an UPDATE STATISTICS run by setting a sample size of 0 (SAMPLE definition) for these tables with an ALTER TABLE statement.
See also:
EXPLAIN Statement (explain_statement)
Concepts of the Database System, SQL Optimizer, SQL Optimizer Statistics, EXPLAIN Statement
Database Administration in CCMS, SQL Optimizer in CCMS, Scheduling Updates of the SQL Optimizer Statistics