Sep 26, 2013

Fix : SMON process consuming more CPU time and degrading performance

~ Tested in Oracle 11gR2 environment

Issue Faced :

I met with unusual situation where a few users complained about query hanging. But the cause of this hang is not the size of table o (the size were rather small; 50000+ rows) or any kind of query plan change. 

We proceeded to check the locks and found the oracle process SMON was blocking as BLOCK column for V$LOCK showed 1 and with LOCK MODE 4. Further, LOGFILE SWITCH command was executed and the command did not return the prompt. There was no space issue either with TEMP tablespace or LOGARCHIVE destination. 

May be because of SMON locking, the transactions are getting locked. I also fired some random queries on some other table, but for me, the query returned results in usual time. Now, to prevent this situation from cropping up again, I wish to find the root-cause. What may be the possibilities that caused this SMON to block. Oracle was bounced and operation commenced normally. 

About SMON:

The system monitor process (SMON) is in charge of a variety of system-level cleanup duties. The duties assigned to SMON include:
  • Performing instance recovery, if necessary, at instance startup. In an Oracle RAC database, the SMON process of one database instance can perform instance recovery for a failed instance.
  • Recovering terminated transactions that were skipped during instance recovery because of file-read or tablespace offline errors. SMON recovers the transactions when the tablespace or file is brought back online.
  • Cleaning up unused temporary segments. For example, Oracle Database allocates extents when creating an index. If the operation fails, then SMON cleans up the temporary space.
  • Coalescing contiguous free extents within dictionary-managed tablespaces.
SMON checks regularly to see whether it is needed. Other processes can call SMON if they detect a need for it.

Solution :
If SMON process consuming more time in database, check if any role back entries are there or not. I am sure there are rollback entries set for the hidden parameter. Make it zero '0' to release database time spend on SMON process and improve the performance.

Note: Don't touch any hidden parameter in production environment without Oracle's recommendation.

-- Drilling: Collect below values for the given queries:
1) show parameter fast_start_parallel_rollback;
2) select name,value from v$parameter where name='fast_start_parallel_rollback';
3) select * from v$parameter where name like '%distributed_lock_timeout%';
4) select pid, program from v$process where program like '%SMON%';

FAST_START_PARALLEL_ROLLBACK:

SQL> show parameter fast_start_parallel_rollback;
Output : FALSE

This is one of the cause for which SMON consumed more resource. Make it HIGH or LOW. High is recommended in Production environments where more head room available for CPU/ Cores. Parameter setting FALSE will do serial recovery. Some environments FALSE is recommended also.

FAST_START_PARALLEL_ROLLBACK specifies the degree of parallelism used when recovering terminated transactions. Terminated transactions are transactions that are active before a system failure. If a system fails when there are uncommitted parallel DML or DDL transactions, then you can speed up transaction recovery during startup by using this parameter.
Values:
  • FALSE : Parallel rollback is disabled
  • LOW   : Limits the maximum degree of parallelism to 2 * CPU_COUNT
  • HIGH  : Limits the maximum degree of parallelism to 4 * CPU_COUNT
distributed_lock_timeout:
It specifies the amount of time (in seconds) for distributed transactions to wait for locked resources. Check the default value is set or not ( as 60 seconds). This can be verified as per your transactions and with oracle support. Here I found 60. It is ok for me.

Now Find SMON sessions:
SQL> select pid, program from v$process where program like '%SMON%';
I found more sessions. i.e., Sessions are trying to recover. SMON is doing more house keeping activities. If any transaction session is killed or stopped at middle, this situation may occur. I did the following and found good performance in next restart.

Set "_cleanup_rollback_entries":

SQL> alter system set "_cleanup_rollback_entries"=0 scope=spfile sid='*';

After setting this, Database must be bounced.

Hope your performance may improved. Default value for  "_cleanup_rollback_entries" parameter is 100. But don't change if there is no SMON recovery issues in your environment. Must be changed with Oracle support consultation. AS per one of my prod env. requirement I changed to 400 also to speedup recovery for dead transactions.

To see hidden parameter  and value set in your database:

SELECT a.ksppinm  Param,

       b.ksppstvl SessionVal,

       c.ksppstvl InstanceVal,

       a.ksppdesc Descr

  FROM x$ksppi a, x$ksppcv b, x$ksppsv c

 WHERE a.indx = b.indx

   AND a.indx = c.indx

   AND a.ksppinm LIKE '/_%' escape '/'

 ORDER BY 1;


Note: Do above with your own risk. Best option is raise SR with Oracle support and follow the recommendations.

DBAshhh.........njy......

No comments:

Post a Comment

Translate >>