Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
----------------------------------------- ------------ ----------- ------ ------
enq: US - contention                             7,498       4,035    538   21.9
row cache lock                                   8,486       1,240    146    6.7

During normal operation there were no waits on enq: US - contention, and row cache lock waits were between 0 and 4. The high wait events only appeared during the load test, when the system was stressed.
The first peaks on the following graphs correspond to the high waits on the above events observed during the initial load test.
enq: US - contention
row cache lock waits
Note 1332738.1 suggests that this is related to undo segments and would show up in the dc_rollback_segments dictionary cache statistics. However, there was not much difference in this metric between the problem period and a good period. Below is the problem period:
                                   Get    Pct    Scan   Pct      Mod      Final
Cache                         Requests   Miss    Reqs  Miss     Reqs      Usage
------------------------- ------------ ------ ------- ----- -------- ----------
dc_objects                     105,703    0.1       0              0      5,389
dc_rollback_segments            38,780    0.3       0            250        514
dc_segments                      4,234    5.0       0             12      2,719
dc_tablespaces                 165,248    0.0       0              0         21
dc_users                       178,080    0.0       0              0        222

The good period:
                                   Get    Pct    Scan   Pct      Mod      Final
Cache                         Requests   Miss    Reqs  Miss     Reqs      Usage
------------------------- ------------ ------ ------- ----- -------- ----------
dc_objects                     284,414    0.5       0              8      2,753
dc_rollback_segments            22,307    0.0       0              0        515
dc_segments                     17,724    7.9       0             10      1,790
dc_tablespaces                 142,346    0.0       0              0         21
dc_users                       158,440    0.0       0              0        116

Comparing the two, there is only a slight difference in the dictionary cache statistics. Comparing the GES statistics, however, shows the following for the problem period:
                                   GES          GES          GES
Cache                         Requests    Conflicts     Releases
------------------------- ------------ ------------ ------------
dc_objects                          84            2            0
dc_rollback_segments               511          133            0
dc_segments                        352            6            0

There were no requests or conflicts for dc_rollback_segments during the good period.
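The GES counters can also be looked at outside of an AWR report. The query below is only a sketch: it assumes the AWR "GES" columns correspond to the DLM_* columns of gv$rowcache, and the list of caches is simply the three shown above. The values are cumulative since instance startup, so it would have to be run before and after the load test and the two results compared.

```sql
-- Cumulative GES (DLM) statistics for selected dictionary caches, per instance.
-- Counters are since instance startup; sample before and after the test and diff.
select inst_id,
       parameter,
       sum(dlm_requests)  ges_requests,
       sum(dlm_conflicts) ges_conflicts,
       sum(dlm_releases)  ges_releases
from   gv$rowcache
where  parameter in ('dc_rollback_segments', 'dc_segments', 'dc_objects')
group  by inst_id, parameter
order  by parameter, inst_id;
```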
The datafiles assigned to the undo tablespaces have autoextend turned on, and there is enough free space on disk for them to extend. Therefore 420525.1 and 413732.1 weren't much help.
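For reference, the autoextend setting of the undo datafiles can be confirmed with something along these lines. This is a minimal sketch using the standard dictionary views; the column aliases are just for illustration, and free space on the disk itself still has to be checked at the OS or ASM level.

```sql
-- Datafiles belonging to undo tablespaces, with current and maximum size.
select f.tablespace_name,
       f.file_name,
       f.autoextensible,
       round(f.bytes / 1024 / 1024)    size_mb,
       round(f.maxbytes / 1024 / 1024) max_mb
from   dba_data_files f
join   dba_tablespaces t on t.tablespace_name = f.tablespace_name
where  t.contents = 'UNDO'
order  by f.tablespace_name, f.file_name;
```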
742035.1 and 7291739.8 mention bug 7291739, which shows up as high contention on the above two wait events when auto-tuned undo retention is in use. I therefore applied the patch for bug 7291739 and set _first_spare_parameter to the run length of the longest running query found in v$undostat (this is 11.1; other versions may require HIGHTHRESHOLD_UNDORETENTION instead, refer to the notes mentioned above). Running the load test again didn't show any improvement, and the high waits could still be seen (the second peak on the above graphs).
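The longest running query length can be read from v$undostat roughly as follows. This is a sketch rather than the exact query used here: MAXQUERYLEN and TUNED_UNDORETENTION are both reported in seconds, the view only keeps a limited history of 10-minute intervals, and on RAC each instance has its own v$undostat, so it should be checked on every instance (or through gv$undostat).

```sql
-- Longest query length (seconds) and highest auto-tuned undo retention
-- recorded in the undo statistics history of the current instance.
select max(maxquerylen)         longest_query_secs,
       max(tuned_undoretention) max_tuned_undoretention
from   v$undostat;
```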
Raised an SR. Oracle couldn't determine why the patch was not effective in reducing the high wait events and suggested another hidden parameter, _rollback_segment_count (also mentioned as a workaround in 1332738.1). The recommendation was to set this parameter to 1.5 times the number of online undo segments.
SQL> select TABLESPACE_NAME,count(*) from DBA_ROLLBACK_SEGS where status='ONLINE' group by tablespace_name;

TABLESPACE_NAME   COUNT(*)
--------------- ----------
UNDOTBS1               323
SYSTEM                   1
UNDOTBS2               300

According to Oracle the value set is for the "entire instance not for undo tablespace", which I would imagine means per database and not per instance. This value acts as the "lower limit for the number of undo segments online at a given time". Setting it doesn't make the database proactively bring that many undo segments online. It is the minimum number of undo segments to keep online, and it only comes into play if the number of undo segments goes beyond the value specified. So going by the above statistics the value to set would be (323 + 300) x 1.5 = 934.5, rounded up to 935. One more thing: this value is not dynamic and requires a restart (not an ideal workaround for a busy production system).
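The arithmetic can be done in the same query. The sketch below excludes the SYSTEM rollback segment, as in the calculation above, and rounds up; with the counts shown it returns 623 and 935. The ALTER SYSTEM statement is my assumption of how the value would be set (hidden parameters take a leading underscore and must be double-quoted, and scope=spfile is used because the change needs a restart); use the exact statement given by Oracle Support in the SR.

```sql
-- Online undo segments across all undo tablespaces and 1.5 times that, rounded up.
select count(*)             online_undo_segments,
       ceil(count(*) * 1.5) suggested_value
from   dba_rollback_segs
where  status = 'ONLINE'
and    tablespace_name <> 'SYSTEM';

-- Assumed syntax, only under guidance from Oracle Support; takes effect after a restart.
alter system set "_rollback_segment_count" = 935 scope = spfile sid = '*';
```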
After the above value was set, running the load test did not result in any enq: US - contention waits. It should be noted that the patch was still in place with this parameter set, but it is highly unlikely that it contributed to resolving the high waits. It is possible that _rollback_segment_count alone is responsible for reducing the high waits. This will be verified once the patch is rolled back later on.
Useful metalink notes
Full UNDO Tablespace In 10gR2 [ID 413732.1]
Contention Under Auto-Tuned Undo Retention [ID 742035.1]
Automatic Tuning of Undo_retention Causes Space Problems [ID 420525.1]
How to correct performance issues with enq: US - contention related to undo segments [ID 1332738.1]
Bug 7291739 - Contention with auto-tuned undo retention or high TUNED_UNDORETENTION [ID 7291739.8]