Apr 17, 2015

Finding RCA "Why Shutdown Immediate hangs or taking longer time?"

RCA- shutdown hang

As per my experience I found hang situation so many times. What method I followed and used to analyze, I am just giving some steps. It may help you when you face this kind of situation.

To check reason why shutdown immediate hangs:

SQL>connect / as SYSDBA
SQL>Select * from x$ktuxe where ktuxecfl = 'DEAD';

This shows dead transactions that SMON is looking to rollback.

Now Plan to shutdown again and gather some information. Before issuing the shutdown immediate command set some events as follows:

SQL>alter session set events '10046 trace name context forever, level 12';
SQL>alter session set events '10400 trace name context forever, level 1';
SQL>shutdown immediate;

10046 turns on extended SQL_TRACE for the shutdown process.
10400 dumps a system state every 5 minutes.

The trace files should show where the time is going. To check the progress of SMON is very important in 
this case. You can find it with the below query.

SELECT r.NAME "RB Segment Name",
       dba_seg.size_mb,
       DECODE(TRUNC(SYSDATE - LOGON_TIME),
              0,
              NULL,
              TRUNC(SYSDATE - LOGON_TIME) || ' Days' || ' + ') ||
       TO_CHAR(TO_DATE(TRUNC(MOD(SYSDATE - LOGON_TIME, 1) * 86400), 'SSSSS'),
               'HH24:MI:SS') LOGON,
       v$session.SID,
       v$session.SERIAL#,
       p.SPID,
       v$session.process,
       v$session.USERNAME,
       v$session.STATUS,
       v$session.OSUSER,
       v$session.MACHINE,
       v$session.PROGRAM,
       v$session.module,
       action
  FROM v$lock l,
       v$process p,
       v$rollname r,
       v$session,
       (SELECT segment_name, ROUND(bytes / (1024 * 1024), 2) size_mb
          FROM dba_segments
         WHERE segment_type = 'TYPE2 UNDO'
         ORDER BY bytes DESC) dba_seg
 WHERE l.SID = p.pid(+)
   AND v$session.SID = l.SID
   AND TRUNC(l.id1(+) / 65536) = r.usn
      -- AND l.TYPE(+) = 'TX' AND
      -- l.lmode(+) = 6
   AND r.NAME = dba_seg.segment_name
--AND v$session.username = 'SYSTEM'
--AND status = 'INACTIVE'
 ORDER BY size_mb DESC;


Cause: 

Shut down immediate may hang because of various reasons.
  • §         Processes still continue to be connected to the database and do not terminate.
  • §         SMON is cleaning temp segments or performing delayed block cleanouts.
  • §         Uncommitted transactions are being rolled back.

Debugging a hung database in oracle version 11g:

Back in oracle 10g a hung database was real problem, especially could not connect via SQL*plus release the source of the hanging. There is a new feature in Oracle 11g SQL*Plus called the “prelim” option. This option is very useful for running oradebug and other utilities that do not require a real connection to the database.

$ sqlplus –prelim
-or- in SQL you can set

-- In windows systems,

SQL>Set _prelim on
SQL>connect / as sysdba

Now you are able to run oradebug commands to diagnose a hung database issue:

SQL> oradebug hanganalyze 3

Wait at least 2 minutes to give time to identify process state changes.

SQL>oradebug hanganalyze 3

Open a separate SQL session and immediately generate a system state dump.

SQL>alter session set events 'immediate trace name SYSTEMSTATE level 10';

How to Check why shutdown immediate taking longer time to shutdown?

When shutdown immediate taking longer time as compare to the normal time usually it is taking. You must perform following task before performing actual shutdown immediate.
  1. All active session.
  2. Temporary Tablespace Recover.
  3. Long Running Query in Database.
  4. Large Transaction.
  5. Progress of the Transaction that oracle is recovering.
  6. Parallel Transaction Recovery.
SQL> Select sid, serial#, username, status, schemaname, logon_time
  from v$session
 where status = 'ACTIVE'
   and username is not null;


If Active session is exist then, try to find out what is doing in the database by this session. Active session makeshutdown slower

SQL> Select f.R "Recovered", u.nr "Need Recovered"
  from (select count(block#) R, 1 ch from sys.fet$) f,
       (select count(block#) NR, 1 ch from sys.uet$) u
 where f.ch = u.ch;


Check to see any long query is running into the database while you are trying to shutdown the database.

SQL> Select * from v$session_longops where time_remaining>0 order by username;

Check to ensure large transaction is not going on while you are trying to shutdown the database.

SQL> Select sum(used_ublk) from v$transaction;

Check the progress of the transaction that oracle is recovering.

SQL> Select * from v$fast_start_transactions;

Check to ensure that any parallel transaction recovery is going on before performing shutdown immediate.

SQL> Select * from v$fast_start_servers;

Finally if you do not understand the reason why the shutdown is hanging or taking longer time to shutdown then try to shutdown your database with ‘abort’ option and startup with ‘restrict’ option and try shutdown with ‘immediate’ option.

Check the alert.log, if you find any error related ‘Thread 1 cannot allocate new log, sequence’ then you need to enable your archival process. Your archival is disable due to any reason.

Process:

1. In command prompt set the oracle_sid first
$ export ORACLE_SID=<SID>

2. Now start the SQL*plus:
$ sqlplus /nolog
SQL>connect sys/***@instance_name
SQL>Select instance_name from v$instance;

3. Try to checkpoint before shutdown abort
SQL>alter system checkpoint;
SQL> shutdown abort;

4. Start the database with ‘restrict’ option so that no other user is able to connect you in the mean time.
SQL>startup restrict;
SQL>select logins from v$instance;

--> RESTRICTED

SQL>shutdown immediate;

5. Mount the database and ensure archive process is enabling by using archive log list command. If it is disabling then enable it.
SQL>startup mount;
SQL> archive log list;  --if disable then enable it
SQL>Alter database archivelog;
SQL> Alter system archive log start;

Note: If your archivelog destination and format is already set no need to set again. After setting check with the ‘archive log list’ command archival is enable or not.

SQL> alter database open;
Now check if your database is still in restricted mode then remove the restriction.

SQL>select logins from v$instance;
SQL>alter system disable restricted session;

Note: Now try to generate archivelog with any command

SQL>alter system archivelog current;
SQL>alter system switch logfile;

Now try to check or perform normal shutdown and startup with the database

Now try to check or perform normal shutdown and startup with the database

Reference :

Ref. 1076161.6: Shutdown immediate or shutdown Normal hangs. SMON disabling TX recovery
Ref. Note 375935.1: What to do and not to do when shutdown immediate hangs.
Ref. Note 428688.1: Shutdown immediate very slow to close database.

No comments:

Post a Comment

Translate >>