Wednesday, October 5, 2011

Latch Free Waits

A test development running 11gR1 ( started to show high number of latch free waits during a load test.
Looking at the AWR report it could be seen that latch free waits is part of the top 5 events.
Going through the latch activity section of the AWR report it could seen Result Cache: Latch and SQL memory manager latch having the large wait times compared to other latch types.
Result Cache latch is used for controlling access to the result cache.

Used GV$RESULT_CACHE_OBJECTS to identify the objects that are being cached. In this case there was only one object which was a function. Remove the caching option from the function and ran the load test again.

In the second test the latch free waits were less and concurrency (cache buffer chains) were predominant.

Top 5 wait events shows that latch free events have decreased compared to previous time.

The high number of cache buffer chains are as a result of multiple sessions running the same sql and accessing same set of blocks (hot blocks). It is this sql that was inside the function that had used result cached feature. Since result caching was removed from the function sql was getting executed rather than result being served from the cache.

Predominant latch in the second period was cache buffer chain.

Comparison AWR report shows DB Time being reduced while DB CPU time increasing.

Wait Event comparison captures the reduction of the latch free and increase of cache buffer chain.

Latch wait time comparison

Mutex Sleep Summary in test 1.

Mutex Sleep Summary in test 2.

Comparison of the above two summaries shows during the second period mutex were obtained much quicker and far less sleeps to get the mutex.

Take home point is even though result cache is expected to reduce the load on the database by preventing the execution of the SQL and serving sessions using the cache, it could have other effects/bottlenecks/issues etc that could negate the gains.

Another event that was predominant in these test was SQL Memory Manager latch. Metalink note 9732503.8 mentions Bug 9732503 with regard to this latch in certain platforms and Oracle versions. More importantly it also states

This issue is often caused when the value of pga_aggregate_target is set too low, ie. when the statistic "over allocation count" is non zero in v$pgastat. When this is the case, increase pga_aggregate_target such that it is set to at least "maximum PGA allocated" in v$pgastat. You can find more information on tuning pga_aggregate_target in the Oracle tuning guide.

Metalink notes useful in diagnosing latch free wait events

WAITEVENT: "latch free" Reference Note [ID 34576.1]
How to Identify Which Latch is Associated with a "latch free" wait [ID 413942.1]