Feb 6, 2014

Oracle Wait Events and Solution

Different Wait Events and Solution in Oracle
-        List of popular events
-        Eradication
List of some of wait events:
-          Buffer busy wait
-          DB file sequential read
-          Enq: TX – row lock contention
-          Enq : TM – index contention
-          Row cache lock wait
-          Read by other session

1) Buffer Busy Wait:
            This wait event happens when a session tries to access a block in the buffer cache but it can't because the buffer is busy, that is another session is modifying the block and the contents of the block are in flux.

Getting more information about buffer busy waits
To get more information about the SQL statement being executed and the block being waited for, trace the session or gather data from V$SESSION V$SESSION_WAIT (or just V$SESSION in 10g and higher):

sql>
SELECT s.sql_hash_value, sw.p1 file#, sw.p2 block#, sw.p3 reason
FROM v$session_wait sw, v$session s
WHERE sw.event = 'buffer busy waits'
AND sw.sid = s.sid;


>> P1 - file number of the data file containing the block being waited for
>> P2 - block number of the block being waited for
>> P3 - this wait is called from many different sections of Oracle code and each uses their own reason code which differ among versions.

To determine the object being waited for, use the P1 (file_number) and P2 (block_number) information from the above query:

SELECT  owner , segment_name , segment_type
FROM  dba_extents
WHERE  file_id = &FileNumber
AND  &BlockNumber BETWEEN block_id AND block_id + blocks -1;


Another query that can be very useful is finding the objects in the entire Oracle database that are suffering from "buffer busy waits". The following query gives the top 10 segments:

SELECT * FROM (
   SELECT owner, object_name, subobject_name, object_type,
          tablespace_name, value
   FROM v$segment_statistics
   WHERE statistic_name='buffer busy waits' and owner not like '%SYS%'
   ORDER BY value DESC)
WHERE ROWNUM <=10;

Fixing buffer busy waits :

Once the database object is known, consider the following causes of contention and their solutions.

a) Undo Header - If using Automatic Undo Management (AUM), increase the size of the undo tablespace. If not using AUM, add more rollback segments.

b) Undo Block - If using AUM, increase size of the undo tablespace. If not using AUM, increase rollback segment sizes.

c) Data Block- Data blocks are the blocks that hold the row data in a table or index. The typical problem is that multiple sessions are requesting a block that is either not in cache or in an incompatible mode (read by other session in 10g and higher).

d) Tune inefficient queries that read too many blocks into the buffer cache. These queries could flush out blocks that may be useful for other sessions in the buffer cache. By tuning queries, the number of blocks that need to be read into the cache is minimized, reducing aging out of the existing "good" blocks in the cache.

e) Resolve Hot Blocks - If the queries above consistently return the same block or set of blocks, this is considered a hot block scenario. Delete some of the hot rows and insert them back into the table. Most of the time, the rows will be placed in a different block. The DBA may need to adjust PCTFREE and/or PCTUSED to ensure the rows are placed into a different block. Also talk with the developers to understand why a set of blocks are hot.

f) Place Table in Memory - Cache the table or keep the table in the KEEP POOL. When multiple sessions are requesting the blocks that reside in the disk, it takes too much time for a session to read it into the buffer cache. Other session(s) that need the same block will register 'buffer busy wait'. If the block is already in buffer cache, however, this possibility is eliminated. Another alternative is to increase the buffer cache size. A larger buffer cache means less I/O from disk. This reduces situations where one session is reading a block from the disk subsystem and other sessions are waiting for the block.

g) Fix Low Cardinality Indexes - Look for ways to reduce the number of low cardinality indexes, i.e. an index with a low number of unique values that could result in excessive block reads. This can especially be problematic when concurrent DML operates on table with low cardinality indexes and cause contention on a few index blocks.

h) Adjust PCTFREE/PCTUSED or use ASSM - When sessions insert/delete rows into/from a block, the block must be taken out of the freelist if the PCTFREE threshold reached. When sessions delete rows from a block, the block will be put back in the freelist if PCTUSED threshold is reached. If there are a lot of blocks coming out of the freelist or going into it, all those sessions have to make that update in the freelist map in the segment header. A solution to this problem is to create multiple freelists. This will allow different insert streams to use different freelists and thus update different freelist maps. This reduces contention on the segment header block. You should also look into optimizing the PCTUSED/PCTFREE parameters so that the blocks don't go in and out of the freelists frequently. Another solution is to use ASSM which avoids the use of freelists all together.

i) Increase Extent Size - If extents are too small, Oracle must constantly allocate new extents causing contention in the extent map

2) Db File Sequential Read :

There are two things you can do to these waits: optimize the SQL statement or reduce the average wait time.

The db file sequential read wait event has three parameters: file#, first block#, and block count. In Oracle Database 10g, this wait event falls under the User I/O wait class. Keep the following key thoughts in mind when dealing with the db file sequential read wait event.

• The Oracle process wants a block that is currently not in the SGA, and it is waiting for the database block to be read into the SGA from disk.
• The two important numbers to look for are the TIME_WAITED and AVERAGE_WAIT by individual sessions.
• Significant db file sequential read wait time is most likely an application issue.

Common Causes, Diagnosis, and Actions :

The db file sequential read wait event is initiated by SQL statements (both user and recursive) that perform single-block read operations against indexes, rollback (or undo) segments, and tables (when accessed via rowid), control files and data file headers. This wait event normally appears as one of the top five wait events, according to systemwide waits.

Physical I/O requests for these objects are perfectly normal, so the presence of the db file sequential read waits in the database does not necessarily mean that there is something wrong with the database or the application. It may not even be a bad thing if a session spends a lot of time on this event. In contrast, it is definitely bad if a session spends a lot of time on events like enqueue or latch free. This is where this single-block read subject becomes complicated. At what point does the db file sequential read event become an issue? How do you define excessive? Where do you draw the line? These are tough questions, and there is no industry standard guideline. You should establish a guideline for your environment. For example, you may consider it excessive when the db file sequential read wait represents a large portion of a process response time. Another way is to simply adopt the nonscientific hillbilly approach—that is, wait till the users start screaming.

You can easily discover which session has high TIME_WAITED on the db file sequential read wait event from the V$SESSION_EVENT view. The TIME_WAITED must be evaluated with the LOGON_TIME and compared with other nonidle events that belong to the session for a more accurate analysis. Sessions that have logged on for some time (days or weeks) may accumulate a good amount of time on the db file sequential read event. In this case, a high TIME_WAITED may not be an issue. Also, when the TIME_WAITED is put in perspective with other nonidle events, it prevents you from being blindsided. You may find another wait event which is of a greater significance. Based on the following example, SID# 192 deserves your attention and should be investigated:

sql >
select b.sid,
       nvl(substr(a.object_name, 1, 30),
           'P1=' || b.p1 || 'P2=' || b.p2 || 'P3=' || b.p3) "Object_Name",
       a.subobject_name,
       a.object_type
  from dba_objects a, v$session_wait b, x$bh c
 where c.obj = a.object_id(+)
   and b.p1 = c.file#(+)
   and b.p2 = c.dbablk(+)
   and b.event = 'db file sequential read'
UNION
select b.sid,
       nvl(substr(a.object_name, 1, 30),
           'P1=' || b.p1 || 'P2=' || b.p2 || 'P3=' || b.p3) "Object_Name",
       a.subobject_name,
       a.object_type
  from dba_objects a, v$session_wait b, x$bh c
 where c.obj = a.object_id(+)
   and b.p1 = c.file#(+)
   and b.p2 = c.dbablk(+)
   and b.event = 'db file sequential read';

3) Enq: TX - row lock contention :

A wait for the Oracle TX enqueue in mode 6 (row lock contention) is a common enqueue wait, and occurs when a transaction tries to update or delete rows that are currently locked by another transaction.

A wait for the TX enqueue in mode 6 (P1 = 1415053318, P1RAW = 54580006) is the most common enqueue wait. (In Oracle Database 10g/ 11g, the wait event name is enq: TX—row lock contention.) This indicates contention for row-level lock. This wait occurs when a transaction tries to update or delete rows that are currently locked by another transaction. This usually is an application issue. The waiting session will wait until the blocking session commits or rolls back its transaction. There is no other way to release the lock. (Killing the blocking session will cause its transaction to be rolled back.)

The following listing shows an example of TX enqueue wait in mode 6 as seen in the V$LOCK view:

Whenever you see an enqueue wait event for the TX enqueue, the first step is to find out who the blocker is and if there are multiple waiters for the same resource by using the following query. If the blocking session is an ad-hoc process, then the user may be taking a break. In this case, ask the user to commit or roll back the transaction. If the blocking session is a batch or OLTP application process, then check to see if the session is still “alive.” It may be a live Oracle session, but its parent process may be dead or hung. In this case, chances are you will have to kill the session to release the locks. Be sure to confirm with the application before killing a production process.

Drilling the issue :

SQL>
select /*+ ordered */
a.sid "Blocker_SID",a.username "UserName",a.serial#,a.logon_time,
b.type,b.lmode "Mode_Held",b.ctime "Time_Held",c.sid "Waiter_SID",c.request,c.ctime "Time_Waited"
from v$lock b,v$enqueue_lock c, v$session a
where a.sid=b.sid
and b.id1=c.id1(+)
and b.id2=c.id2(+)
and c.type(+)='TX'
and b.type='TX'
and b.block=1
order by b.ctime,c.ctime;

You can discover the resource that is being competed for. The resource ID is available in the V$LOCK.ID1 column of the DML lock (TM) for that transaction. It is also available in the V$SESSION.ROW_WAIT_OBJ# of the waiting session. The following query retrieves the resource of the TX enqueue wait:


select c.sid "Waiter_SID",a.object_name,a.object_type
from dba_objects a, v$session b, v$session_wait c
where (a.object_id=b.row_wait_obj# OR a.data_object_id=b.row_wait_obj#)
and b.sid=c.sid
and chr(bitand(c.p1,-16777216)/16777215)||chr(bitand(c.p1,16711680)/65535) = 'TX'
and c.event='enquue';

Don’t forget to extract the SQL statement that is executed by the waiting session as well as the SQL statement that is executed by the blocking session. These statements will give you an idea of what the waiting session is trying to do and what the blocking session is doing. They are also important points of reference for the application developers so that they can quickly locate the modules. (By the way, the SQL statement that is currently being executed by the blocking session is not necessarily the statement that holds the lock. The statement that holds the lock might have been run a long time ago.)

4) Enq: TM - index contention :

Waits on enq: TM - contention in Oracle indicate there are unindexed foreign key constraints. In this article, I examine how foreign key constraints relate to this wait event and how to tune for this event.

Recently, I was assisting my company to diagnose sessions waiting on the "enq: TM - contention" event. The blocked sessions were executing simple INSERT statements similar to:

INSERT INTO supplier VALUES (:1, :2, :3);Waits on enq: TM - contention indicate there are unindexed foreign key constraints. Reviewing the SUPPLIER table, we found a foreign key constraint referencing the PRODUCT table that did not have an associated index. This was also confirmed by the Top Objects feature of Ignite for Oracle because all the time was associated with the PRODUCT table. We added the index on the column referencing the PRODUCT table and the problem was solved.

Finding the root cause of the enq: TM - contention wait event

Oracle's locking feature to find the blocking sessions, we found the real culprit. Periodically, as the company reviewed its vendor list, they "cleaned up" the SUPPLIER table several times a week. As a result, rows from the SUPPLIER table were deleted. Those delete statements were then cascading to the PRODUCT table and taking out TM locks on it.

Reproducing a typical problem that leads to this wait : - A demo

This problem has a simple fix, but I wanted to understand more about why this happens. So I reproduced the issue to see what happens under the covers. I first created a subset of the tables from this customer and loaded them with sample data.

CREATE TABLE supplier
( supplier_id number(10) not null,
supplier_name varchar2(50) not null,
contact_name varchar2(50),
CONSTRAINT supplier_pk PRIMARY KEY (supplier_id)
);

INSERT INTO supplier VALUES (1, 'Supplier 1', 'Contact 1');
INSERT INTO supplier VALUES (2, 'Supplier 2', 'Contact 2');
COMMIT;

CREATE TABLE product
( product_id number(10) not null,
product_name varchar2(50) not null,
supplier_id number(10) not null,
CONSTRAINT fk_supplier
FOREIGN KEY (supplier_id)
REFERENCES supplier(supplier_id)
ON DELETE CASCADE );

INSERT INTO product VALUES (1, 'Product 1', 1);
INSERT INTO product VALUES (2, 'Product 2', 1);
INSERT INTO product VALUES (3, 'Product 3', 2);
COMMIT;

I then executed statements similar to what we found at this customer:

User 1: DELETE supplier WHERE supplier_id = 1;
User 2: DELETE supplier WHERE supplier_id = 2;
User 3: INSERT INTO supplier VALUES (5, 'Supplier 5', 'Contact 5');

Similar to the customer's experience, User 1 and User 2 hung waiting on "enq: TM - contention". Reviewing information from V$SESSION I found the following:


SELECT l.sid, s.blocking_session blocker, s.event, l.type, l.lmode, l.request, o.object_name, o.object_type
FROM v$lock l, dba_objects o, v$session s
WHERE UPPER(s.username) = UPPER('&User')
AND l.id1 = o.object_id (+)
AND l.sid = s.sid
ORDER BY sid, type;



Following along with the solution we used for our customer, we added an index for the foreign key constraint on the SUPPLIER table back to the PRODUCT table:

CREATE INDEX fk_supplier ON product (supplier_id);

When we ran the test case again everything worked fine. There were no exclusive locks acquired and hence no hanging. Oracle takes out exclusive locks on the child table, the PRODUCT table in our example, when a foreign key constraint is not indexed.

Sample query to find unindexed foreign key constraints
Now that we know unindexed foreign key constraints can cause severe problems, here is a script that I use to find them for a specific user (this can easily be tailored to search all schemas):

SELECT *
  FROM (SELECT c.table_name, cc.column_name, cc.position column_position
          FROM user_constraints c, user_cons_columns cc
         WHERE c.constraint_name = cc.constraint_name
           AND c.constraint_type = 'R'
        MINUS
        SELECT i.table_name, ic.column_name, ic.column_position
          FROM user_indexes i, user_ind_columns ic
         WHERE i.index_name = ic.index_name)
 /*where table_name not like '%$%'*/ –- use to find application objects
 ORDER BY table_name, column_position;


5) Row Cache Lock Wait :

In order for DDL (Data Definition Language) to execute, it must acquire a row cache lock to lock the Data Dictionary information. The shared pool contains a cache of rows from the data dictionary that helps reduce physical I/O to the data dictionary tables and allows locking of individual data dictionary rows.

A closer look at the row cache lock wait event:

Each row cache lock will be on a specific data dictionary object. This is called the enqueue type and can be found in the v$rowcache view. In this sample select from v$rowcache you can find the enqueue types and the type of activity being performed within the dictionary cache.

PARAMETER COUNT GETS GETMISSES MODIFICATIONS
--------------------- ----- ------------ ---------- -------------
dc_free_extents 0 0 0 0
dc_used_extents 0 0 0 0
dc_segments 5927 131379921 4142831 693734
dc_tablespaces 22 188609668 2436 0
dc_tablespace_quotas 12 22779303 3843 0
dc_files 0 165961 22493 21
dc_users 19 145681559 2078 21
dc_rollback_segments 67 3906307 66 232
dc_objects 1927 70725250 2247804 74803
dc_sequences 4 142714 1599 142714

Common row cache enqueue lock types:

The tuning of the row cache lock wait is dependent upon the activity for each of the enqueue types. Of these, the most common are:

•DC_SEQUENCES:  this row cache lock wait may occur during the use of sequences. Tune by checking sequences to see if they have the cache option specified and if that cache value is reflective of the anticipated simultaneous inserts by the application.
•DC_USED_EXTENTS and DC_FREE_EXTENTS: this row cache lock wait may occur during space management operations where tablespaces are fragmented or have inadequate extent sizes. Tune by checking whether tablespaces are fragmented, extent sizes are too small, or tablespaces are managed manually.
•DC_TABLESPACES: this row cache lock wait may occur during the allocation of new extents. If extent sizes are set too low the application may frequently request new extents which could cause contention. Tune by checking for rapidly increasing number of extents.
•DC_OBJECTS: this row cache lock wait may occur during the recompilation of objects. If object compiles are occurring this can require an exclusive lock which will block other activity. Tune by examining invalid objects and dependencies.

Tuning for the row cache lock wait event:

The row cache lock wait event is associated with a specific enqueue type on a data dictionary row. Checking activity within the V$ROWCACHE view is a good place to start for understanding this relationship, as tuning can only be accomplished with analysis of the enqueue type.

If a trace file is available you may also see the following error:

 >> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! <<

Also realize that the row cache lock wait event may appear more frequently when using RAC. This is because the library cache and the row cache are global in RAC—causing the row cache lock wait to be more pronounced.

6) Read by Other Session :

When a session waits on the "read by other session" event, it indicates a wait for another session to read the data from disk into the Oracle buffer cache. If this happens too often the performance of the query or the entire database can suffer. Typically this is caused by contention for "hot" blocks or objects so it is imperative to find out which data is being contended for. Once that is known, there are several alternative methods for solving the issue.

When information is requested from the database, Oracle will first read the data from disk into the database buffer cache. If two or more sessions request the same information, the first session will read the data into the buffer cache while other sessions wait. In previous versions this wait was classified under the "buffer busy waits" event. However, in Oracle 10.1 and higher this wait time is now broken out into the "read by other session" wait event. Excessive waits for this event are typically due to several processes repeatedly reading the same blocks, e.g. many sessions scanning the same index or performing full table scans on the same table. Tuning this issue is a matter of finding and eliminating this contention.

Finding the contentions :

When a session is waiting on the "read by other session" event, an entry will be seen in the v$session_wait system view, which will give more information on the blocks being waited for:

SELECT p1 "file#", p2 "block#", p3 "class#"
FROM v$session_wait
WHERE event = 'read by other session';

If information collected from the above query repeatedly shows that the same block (or range of blocks) is experiencing waits, this indicates a "hot" block or object. The following query will give the name and type of the object:

SELECT relative_fno, owner, segment_name, segment_type
 FROM dba_extents
 WHERE file_id = &file
 AND &block BETWEEN block_id AND block_id + blocks - 1;

Eliminating contentions:
Depending on the Oracle database environment and specific performance situation the following variety of methods can be used to eliminate contention:

Tune inefficient queries - This is one of those events you need to "catch in the act" through the v$session_wait view as prescribed above. Then, since this is a disk operating system issue, take the associated system process identifier (c.spid) and see what information you can obtain from the operating system.

Redistribute data from the hot blocks - Deleting and reinserting the hot rows will often move them to a new data block. This will help decrease contention for the hot block and increase performance. More information about the data residing within the hot blocks can be retrieved with queries similar to the following:

SELECT data_object_id
 FROM dba_objects
 WHERE owner='&owner' AND object_name='&object';

SELECT dbms_rowid.rowid_create(1,<data_object_id>,<relative_fno>,<block>,0) start_rowid
 FROM dual;

--rowid for the first row in the block

 SELECT dbms_rowid.rowid_create(1,<data_object_id>,<relative_fno>,<block>,500) end_rowid FROM dual;

--rowid for the 500th row in the block

 SELECT <column_list>
 FROM <owner>.<segment_name>
 WHERE rowid BETWEEN <start_rowid> AND <end_rowid>


15 comments:

  1. Very Helpful and well explained

    ReplyDelete
  2. This comment has been removed by a blog administrator.

    ReplyDelete
  3. This comment has been removed by a blog administrator.

    ReplyDelete
  4. This comment has been removed by a blog administrator.

    ReplyDelete
  5. This comment has been removed by a blog administrator.

    ReplyDelete
  6. This comment has been removed by a blog administrator.

    ReplyDelete
  7. This comment has been removed by a blog administrator.

    ReplyDelete
  8. This comment has been removed by a blog administrator.

    ReplyDelete
  9. This comment has been removed by a blog administrator.

    ReplyDelete
  10. Outstanding read, I just passed this onto a colleague who was doing a little investigation on that. And he actually bought me lunch because I discovered it for him smile So let me rephrase that: Thanks creditcardgenerator

    ReplyDelete
  11. Wow, the information you shared is absolutely amazing! Thank you so much for that. Speaking of sharing, I came across a really helpful link about color blindness that I just have to pass on to you. Click on the link above and check it out! Keep up the great work on your blog, and keep us in the loop. Cheers!
    I have one more information related with roblox..Visit here

    ReplyDelete

Translate >>