Thursday, May 26, 2011

11gR2 AWR runs slowly due to unrealistic system statistics

After database upgraded from 10.2.0.2 to 11.2.0.2, found AWR runs slower than 10g, the more snapshots covered the longer it takes.

Struggled for days, finally notice below system statistics in read is nor realistic to seem.
SYSP> select * from aux_stats$;

SNAME                          PNAME                               PVAL1
------------------------------ ------------------------------ ----------
PVAL2
------------------------------------------------------------------------------------------------------------------------------------
SYSSTATS_INFO                  STATUS
COMPLETED

SYSSTATS_INFO                  DSTART
02-23-2011 09:05

SYSSTATS_INFO                  DSTOP
02-23-2011 09:31

SYSSTATS_INFO                  FLAGS                                   1


SYSSTATS_MAIN                  CPUSPEEDNW                     1400.84388


SYSSTATS_MAIN                  IOSEEKTIM                              10


SYSSTATS_MAIN                  IOTFRSPEED                           4096


SYSSTATS_MAIN                  SREADTIM                         28231.75


SYSSTATS_MAIN                  MREADTIM                        24989.114


SYSSTATS_MAIN                  CPUSPEED                             1170


SYSSTATS_MAIN                  MBRC                                    6


SYSSTATS_MAIN                  MAXTHR                           96043008


SYSSTATS_MAIN                  SLAVETHR                          3421184



13 rows selected.

actions:
1.) backup system statistics and delete it. Run AWR report promptly .

SYS@ODSP> select * from aux_stats$;

SNAME                          PNAME                               PVAL1
------------------------------ ------------------------------ ----------
PVAL2
------------------------------------------------------------------------------
SYSSTATS_INFO                  STATUS
COMPLETED
SYSSTATS_INFO                  DSTART
05-26-2011 14:39
SYSSTATS_INFO                  DSTOP
05-26-2011 14:39
SYSSTATS_INFO                  FLAGS                                   0

SYSSTATS_MAIN                  CPUSPEEDNW                           1151

SYSSTATS_MAIN                  IOSEEKTIM                              10

SYSSTATS_MAIN                  IOTFRSPEED                           4096

SYSSTATS_MAIN                  SREADTIM

SYSSTATS_MAIN                  MREADTIM

SYSSTATS_MAIN                  CPUSPEED

SYSSTATS_MAIN                  MBRC

SYSSTATS_MAIN                  MAXTHR

SYSSTATS_MAIN                  SLAVETHR


13 rows selected.
2.) Gather NOWORKLOAD system statistics. Run report okay.
select * from aux_stats$;

SNAME                          PNAME                               PVAL1
------------------------------ ------------------------------ ----------
PVAL2
------------------------------------------------------------------------------
SYSSTATS_INFO                  STATUS
COMPLETED
SYSSTATS_INFO                  DSTART
05-26-2011 14:46
SYSSTATS_INFO                  DSTOP
05-26-2011 14:46
SYSSTATS_INFO                  FLAGS                                   1

SYSSTATS_MAIN                  CPUSPEEDNW                           1152

SYSSTATS_MAIN                  IOSEEKTIM                               6

SYSSTATS_MAIN                  IOTFRSPEED                          34397

SYSSTATS_MAIN                  SREADTIM

SYSSTATS_MAIN                  MREADTIM

SYSSTATS_MAIN                  CPUSPEED

SYSSTATS_MAIN                  MBRC

SYSSTATS_MAIN                  MAXTHR

SYSSTATS_MAIN                  SLAVETHR


13 rows selected.

Finally from http://structureddata.org/2008/01/02/what-are-your-system-statistics/ , I realized it is a bug.   Bug 9842771  Wrong SREADTIM and MREADTIM statistics in AUX_STATS$

The workaround is to manually set SYSTEM statistics or NOWORKLOAD system statistics.

Lesson learned again : understand what exactly you are going to do in depth. BTW, since this is a known bug, I think the complete checklist of upgrade guide should be updated.

Here are more informations about system statistics.

1. When Oracle gathers system statistics, it analyzes system activity in a specified time period (workload statistics) or simulates a workload (noworkload statistics). The statistics are collected using the DBMS_STATS.GATHER_SYSTEM_STATS procedure. Oracle Corporation highly recommends that you gather system statistics.

gathering_mode
   

Mode values are:

NOWORKLOAD: Will capture characteristics of the I/O system. Gathering may take a few minutes and depends on the size of the database. During this period Oracle will estimate the average read seek time and transfer speed for the I/O system. This mode is suitable for the all workloads. Oracle recommends to run GATHER_SYSTEM_STATS ('noworkload') after creation of the database and tablespaces. To fine tune system statistics for the workload use 'START' and 'STOP' or 'INTERVAL' options. If you gather both 'NOWORKLOAD' and workload specific (statistics collected using 'INTERVAL' or 'START' and 'STOP' ), the workload statistics will be used by optimizer. Collected components: cpuspeednw, ioseektim, iotfrspeed.

INTERVAL: Captures system activity during a specified interval. This works in combination with the interval parameter. You should provide an interval value in minutes, after which system statistics are created or updated in the dictionary or stattab. You can use GATHER_SYSTEM_STATS (gathering_mode=>'STOP') to stop gathering earlier than scheduled. Collected components: maxthr, slavethr, cpuspeed, sreadtim, mreadtim, mbrc.

START | STOP: Captures system activity during specified start and stop times and refreshes the dictionary or stattab with statistics for the elapsed period. Interval value is ignored. Collected components: maxthr, slavethr, cpuspeed, sreadtim, mreadtim, mbrc.

 -- In this case, I use START|STOP  to sample my database for 30 minutes during normal workload .


3. specs and units
CPUSPEED     Workload CPU speed in millions of cycles/second
CPUSPEEDNW     Noworkload CPU speed in millions of cycles/second
IOSEEKTIM     Seek time + latency time + operating system overhead time in milliseconds
IOTFRSPEED     Rate of a single read request in bytes/millisecond
MAXTHR     Maximum throughput that the I/O subsystem can deliver in bytes/second
MBRC     Average multiblock read count sequentially in blocks
MREADTIM     Average time for a multi-block read request in milliseconds
SLAVETHR     Average parallel slave I/O throughput in bytes/second
SREADTIM     Average time for a single-block read request in milliseconds