Jan 21, 2014

SQL & PL/SQL Performance Tuning – for Beginners

-- Performance Tuning – for Beginners
-- Using Oracle SQL & PL/SQL

1) Introduction:
Performance tuning is a major part in a database, as well as database maintenance. As the transaction data increases performance degrades, PL/SQL enhancements becomes mandatory for a database. Any Online transaction processing(OLTP) systems have both real time and MIS related reports along with huge transactions. So, performance tuner has a major role in  OLTP systems. 
2) About Oracle Execution Plan:
An explain plan is a representation of the access path that is taken when a query is executed within Oracle.

3) Gather schema/ Table statistics

exec DBMS_STATS.gather_schema_stats('HR');
exec DBMS_STATS.gather_table_stats('HR','employee');

Let us discuss rules on performance tuning. See the examples wherever given.

1 : Avoid Using * in SELECT Clauses
 The dynamic SQL column reference (*) gives you a way to refer to all of the columns of a table. Do not use the * feature because it is very inefficient -- the * has to be converted to each column in turn. The SQL parser handles all the field references by obtaining the names of valid columns from the data dictionary and substitutes them on the command line, which is time consuming.

2: Reduce the Number of Trips to the Database
Every time a SQL statement is executed, ORACLE needs to perform many internal processing steps; the statement needs to be parsed, indexes evaluated, variables bound, and data blocks read. The more you can reduce the number of database accesses, the more overhead you can save.
For example:
There are 3 distinct ways of retrieving data about employees who have employee numbers 0342 or 0291.
 Method 1 (Least Efficient) :
SELECT EMP_NAME, SALARY, GRADE
FROM EMP
WHERE EMP_NO = 0342;

SELECT EMP_NAME, SALARY, GRADE
FROM EMP
WHERE EMP_NO = 0291;

Method 2 (Most Efficient) :
SELECT EMP_NAME, SALARY, GRADE
FROM EMP
WHERE EMP_NO = 0342 OR EMP_NO = 0291;
         
Note: One simple way to increase the number of rows of data you can fetch with one database access and thus reduce the number of physical calls needed is to reset the ARRAYSIZE parameter in SQL*Plus, Suggested value is 200.

3: Use TRUNCATE instead of DELETE
 When rows are removed from a table, under normal circumstances, the rollback segments are used to hold undo information; if you do not commit your transaction, Oracle restores the data to the state it was in before your transaction started.
With TRUNCATE, no undo information is generated. Once the table is truncated, the data cannot be recovered back. It is faster and needs fewer resources.
Use TRUNCATE rather than DELETE for wiping the contents of small or large tables when you need no undo information generated.

4: Counting Rows from Tables
 Contrary to popular belief, COUNT(*) is faster than COUNT(1). If the rows are being returned via an index, counting the indexed column – for example, COUNT(EMPNO) is faster still.  Use count(rowid) when there is no indexed column or primary key column.

5: Minimize Table Lookups in a Query

To improve performance, minimize the number of table lookups in queries, particularly if your statements include sub-query SELECTs or multi-column UPDATEs.

For example:
Least Efficient :    
SELECT         TAB_NAME
FROM            TABLES
WHERE TAB_NAME = (SELECT      TAB_NAME
                                          FROM           TAB_COLUMNS
                                          WHERE        VERSION = 604)
AND              DB_VER = (SELECT     DB_VER
                                            FROM         TAB_COLUMNS
                                            WHERE      VERSION = 604)
               
Most Efficient :

SELECT         TAB_NAME
FROM            TABLES
WHERE (TAB_NAME, DB_VER) = (SELECT    TAB_NAME, DB_VER
                                                              FROM       TAB_COLUMNS
                                                             WHERE    VERSION = 604)
Multi-column UPDATE example:

Least Efficient :

UPDATE     EMP
SET              EMP_CAT = (SELECT MAX(CATEGORY)
                                             FROM    EMP_CATEGORIES),
SAL_RANGE = (SELECT MAX(SAL_RANGE)
    FROM    EMP_CATEGORIES )
                           WHERE EMP_DEPT  = 0020;
Most Efficient :

UPDATE      EMP
SET              (EMP_CAT, SAL_RANGE) =
(SELECT     MAX(CATEGORY), MAX(SAL_RANGE)
 FROM         EMP_CATEGORIES)
WHERE       EMP_DEPT = 0020;

6 : Use Table Aliases
Always use table aliases & prefix all column names by their aliases where there is more than one table involved in a query. This will reduce parse time & prevent syntax errors from occurring when ambiguously named columns are added later on.

7: Use EXISTS in Place of DISTINCT

Avoid joins that require the DISTINCT qualifier on the SELECT list when you submit queries used to determine information at the owner end of a one-to-many relationship (e.g. departments that have many employees).

For example:

Least Efficient :

SELECT         DISTINCT DEPT_NO, DEPT_NAME
FROM            DEPT D, EMP E
WHERE          D.DEPT_NO = E.DEPT_NO

Most Efficient :

SELECT         DEPT_NO, DEPT_NAME
FROM            DEPT D
WHERE EXISTS (SELECT       ‘X’
                      FROM         EMP E
                      WHERE       E.DEPT_NO = D.DEPT_NO);

EXISTS is a faster alternative because the RDBMS kernel realizes that when the sub-query has been satisfied once, the query can be terminated.

8 : Avoid Calculations on Indexed Columns

If the indexed column is a part of a function (in the WHERE clause), the optimizer does not use an index and will perform a full-table scan instead.

Note : The SQL functions MIN and MAX are exceptions to this rule and will utilize all available indexes.

For example:
 Least Efficient :

SELECT . . .
FROM            DEPT
WHERE         SAL * 12 > 25000;

Most Efficient :
SELECT . . .
FROM            DEPT
WHERE         SAL > 25000 / 12;

9 : Exact Position of Tables & Columns
Exact Position:
When we run a query, the execution starts from last to first. The table which contains less number of records should be kept last.
                        Ex: Table1(1000 records), Table2(100)
                        Select * from table1, table2.
This will not affect the query cost wise but in long term data retrieval will be faster.

Exact position of columns:
At time of joining 2 tables master record should be right hand side of = Symbol.

 Method 1 (Least Efficient) :

Select * from master, child 
where master.columnname=child.columnname;

 Method 2 (Most Efficient) : 

Select * from master, child 
where  child.columnname=master.columnname;

>>> Here data retrial will be faster

10: Null Comparison
We should try to avoid comparing the NULL values column. Use of this kind of column will not allow the optimizer to use of Indexes present on the table.

Method -1 (Least Efficient) :
 Select rt.requestid
  FROM bookrequests rt
 where (:iv_request_id is NULL OR rt.requestid = :iv_request_id);

Method -2 (Most Efficient) :
if (:iv_request_id is not null) THEN
  Select rt.requesttestid
    FROM bookrequests rt
   where rt.requesttestid = :iv_request_id);
Elsif (:iv_request_id IS NULL) THEN
  Select rt.requesttestid FROM bookrequests rt
where rt.requestid is null);
End If;


11: Compatible datatype of columns:
At time of joining tables a due focus should be on the datatype compatibility of the columns used in Join conditions or where clauses.

Method 1 (Least Efficient) :
Sql> Select barcode, price From products Where barcode = 5467;                                                              
Method 2 (Most Efficient) :    
Sql> Select barcode, price From products Where barcode = ‘5467’;                                                           
Such conditions should not be used at all, even though if its required this may require oracle definded datatype conversion functions like To_char ,To_date etc.


  • Rectification of such kind of errors will result in a lot Cost reduction.
  • If  DataType Conversion is performed , Should be used on INPUT Variables not on Table Columns.
12: Inline views,Replacement of Subquery


  • When performing Left outer or Right Outer join ,try to replace it by using Inline Views for small / master tables/ fact tables.


SELECT e.empno,e.ename,d.deptno,d.dname
FROM (SELECT empno, ename from emp@dblink) e, dept d;
  • If the same sub query is used in multiple locations of a query, it can be replaced by making use of Functions.
  • Functions will avoid great context Switch and also code will be reusable.
Summary of techniques for each type of subquery:

Standard Subquery
Anti-join Subquery

IN
EXISTS
NOT IN
NOT EXISTS
Correlated
subquery
Redundant Boolean predicates. Can always be replaced with a standard join
Automatic Transformation to nested loop join
Rewrite as select distinct outer join
Rewrite as select distinct outer join
Non-correlated subquery
Automatic transformation to nested loop join
Never appropriate
Rewrite as nested loop join with minus operator
Never appropriate

13 : Use of Function for Left join, Subquery


Method 1 (Least Efficient) :
select e.empno, e.ename, t.trandate, t.tranamount
  from transaction t
  inner join emp e on t.createdby = e.empno
order by t.tranamount desc;
                                                                   
Method 2 (Most Efficient) :
select counterno,f_showname(t.createdby) "Ename", t.trandate, t.tranamount
  from transaction t
 order by t.tranamount desc;

Using co-related sub-query:

                                                                  
Method 1 (Least Efficient) :

Select pbn.billno
  from Publicbillno pbn
 where pbn.locationid = iv_locationid
   and pbn.billno not in
       (select pb.billno
          from Publicbill pb
         where pb.locationid = iv_locationid)

Method 2 (Most Efficient) : 
Select pbn.billno
          from Patientbillno pbn
         where pbn.locationid = iv_locationid
           and not exists (select pb.billno
                         from Publicbill pb
                        where pb.locationid = iv_locationid
                          and pb.billno = pbn.billno)

14 : Usuage of terms Exists , DECODE,CASE,LIKE


  • Making use of Clauses like `EXISTS,DECODE,CASE‘ will help to reduce the cost of queries.
  • Use of EXISTS in place of IN. EXISTS will give better performance when the main query returns less records
  • CASE will be more effective than an IF condition.
  • Avoid LIKE clause,  If it is used, it should be more specific to the requirement.
  • CASE can be used in SELECT statements and  Pl/SQL block where more IF-ELSE used.
14.1) Use DECODE to Reduce Processing
 The DECODE statement provides a way to avoid having to scan the same rows repetitively or to join the same table repetitively.

 For example:
 SELECT   COUNT(rowid), SUM(SAL)
FROM  EMP WHERE DEPT_NO = 0020 AND ENAME LIKE ‘SMITH%’;

SELECT  COUNT(rowid), SUM(SAL)
FROM            EMP
WHERE DEPT_NO = 0030
AND              ENAME LIKE ‘SMITH%’;

 You can achieve the same result much more efficiently with DECODE:

 SELECT COUNT(DECODE(DEPT_NO, 0020, ‘X’, NULL)) D0020_COUNT,
                 COUNT(DECODE(DEPT_NO, 0030, ‘X’, NULL)) D0030_COUNT,
                 SUM(DECODE(DEPT_NO, 0020, SAL, NULL)) D0020_SAL,
                 SUM(DECODE(DEPT_NO, 0030, SAL, NULL)) D0030_SAL
                 FROM    EMP
     WHERE  ENAME LIKE ‘SMITH%’;

Similarly, DECODE can be used in GROUP BY or ORDER BY clause effectively.

14.2) Use EXISTS in Place of IN for Base Tables

Many base table queries have to actually join with another table to satisfy a selection criteria. In such cases, the EXISTS (or NOT EXISTS) clause is often a better choice for performance.
     
Least Efficient :

SELECT         *
FROM            EMP              (Base Table)
WHERE         EMPNO > 0
AND               DEPTNO IN (SELECT DEPTNO
                                               FROM   DEPT
                                               WHERE LOC = ‘MELB’)
Most Efficient :

SELECT         *
FROM            EMP
WHERE         EMPNO > 0
AND               EXISTS (SELECT      ‘X’
                                       FROM        DEPT
                                       WHERE     DEPTNO = EMP.DEPTNO
                                       AND          LOC = ‘MELB’)
14.3) Use NOT EXISTS in Place of NOT IN

In sub-query statements such as the following, the NOT IN clause causes an internal sort/merge. The NOT IN clause is the all-time slowest test possible as it forces a full read of the table in the sub-query SELECT. Avoid using NOT IN clause either by replacing it with Outer Joins or with a NOT EXISTS clause as shown below:

SELECT . . .
FROM            EMP
WHERE         DEPT_NO NOT IN (SELECT   DEPT_NO
                                                          FROM    DEPT
                                                          WHERE  DEPT_CAT = ‘A’);

To improve the performance, replace this code with:

Method 1 (Efficient) :

SELECT . . .  
FROM            EMP A, DEPT B
WHERE         A.DEPT_NO = B.DEPT_NO (+)
AND               B.DEPT_NO IS NULL
AND               B.DEPT_CAT(+) = 'A'   

Method 2 (Most Efficient) :

SELECT . . .
FROM            EMP E
WHERE         NOT EXISTS (SELECT          ‘X’
                                                 FROM              DEPT
                                                 WHERE           DEPT_NO = E.DEPT_NO
                                                 AND                 DEPT_CAT = ‘A’);

14.4) Use Of Not Exist and Exist

Method 1 (Least Efficient) :

Select count(rowid)  From Products Where barcode NOT IN (Select barcode From Clothing);
Time in Secs : >500secs

Method 2 (Most Efficient) :


Select count(rowid)  From Products P  Where NOT EXISTS  (  Select C.barcode From Clothing C   Where C.barcode = P.barcode);

Method 3 (Least Efficient) : 


Time in Secs: 6.1 secs
Select prod_id,qty  From Product Where prod_id=167 and item_no IN (select item_no from items)
                                                                                          Time in Secs : >120secs
Method 4 (Most Efficient) :
Select prod_id,qty  From Product  prd Where prod_id=167 and exists  (select ‘x’ from items itm  where b.itemno=a.itemno);
                                                                                   Time in Secs: 2.0 secs

15 : Avoid NOT on Indexed Columns

In general, avoid using NOT when testing indexed columns. The NOT function has the same effect on indexed columns that functions do. When ORACLE encounters a NOT, it will choose not to use the index and will perform a full-table scan instead.

Least Efficient : (Here, index will not be used)

SELECT . . .
FROM            DEPT
WHERE         DEPT_CODE <> 0;

Most Efficient : (Here, index will be used)

SELECT . . .
FROM            DEPT
WHERE         DEPT_CODE > 0;

In a few cases, the ORACLE optimizer will automatically transform NOTs (when they are specified with other operators) to the corresponding functions:
NOT >           to       <=
NOT >=         to       <
NOT <           to       >=
NOT <=         to       >

16: Avoid IS NULL and IS NOT NULL on Indexed Columns

Avoid using any column that contains a null as a part of an index. ORACLE can never use an index to locate rows via a predicate such as IS NULL or IS NOT NULL.

In a single-column index, if the column is null, there is no entry within the index. For concatenated index, if every part of the key is null, no index entry exists. If at least one column of a concatenated index is non-null, an index entry does exist.

17: Using Global Temporary Table (GTT)


  • When ever a huge data is being processed in a procedure its always better to use GTT.
  • It will improve the performance as it will also reduce the overhead of data truncation as in normal table, The procedure would also contain a delete statements or else a table truncation.
  • GTT will be allocated separately for multiple sessions accessing the table.
  • This will also allow us to use concept like BULK COLLECT, COLLECTION

Syntax:
Create global temporary gtt_tablename
(col1  datatype, Col2 datatype, …..
) on commit preserve rows [ On commit delete rows];

Note: On commit preserve rows: All data retain in the table till the session closed but On commit delete rows delete all the rows from the GTT as soon as commit fires even if the session not closed. GTT indexes are auto managed by oracle. No water mark created when number of deletion occurs.

Examples:
Method 1 (Least Efficient) :
SELECT NVL(T.BILLID, T.TRANSACTIONID) AS TRANID   
FROM CRM.MAINTRANSACTION  T,  HR.SHIFTMAINTENANCE SM          
     WHERE T.SHIFTNO = SM.SHIFTNO AND
           TRUNC(TRANDATE) = '19-JUN-2013' AND T.FLAG = 1 

AND   T.TRANEVENT NOT IN (28) AND T.CREATEDBY = 8129
            AND T.Locationid=10201
                                                                                                   Execution Time : 300 sec
Method 2 (Most Efficient) :
INSERT INTO CRM.GT_MAINTRANSCTION SELECT * FROM CRM.MAINTRANSCTION
WHERE TRANDATE='19-JUN-2013'
SELECT NVL(T.BILLID, T.TRANSACTIONID) AS TRANID  FROM CRM.GT_MAINTRANSACTION  T,  HR.SHIFTMAINTENANCE SM          
     WHERE T.SHIFTNO = SM.SHIFTNO AND
           TRUNC(TRANDATE) = '19-JUN-2013' AND T.FLAG = 1 AND
           T.TRANEVENT NOT IN (28) AND T.CREATEDBY = 8129
            AND T.Locationid=10201
                                                                                                         Execution Time : 40 sec 
18 : Use BULK COLLECT and FORALL
When inserting rows in PL/SQL, developers often place the insert statement inside a FOR loop.  To insert 1,000 rows using the FOR loop, there would be 1,000 context switches between PL/SQL and the Oracle library cache.

Oracle allows you to do bulk binds using the PL/SQL FORALL loop, which requires only one context switch.  This is achieved by the FORALL statement passing an entire PL/SQL table to the SQL engine in a single step.  Internal tests show that this process results in substantial performance increases over traditional methods.

Clicker here to read more


Example:
Method 1 (Least Efficient) :
DECLARE
TYPE NumTab IS TABLE OF NUMBER(5) INDEX BY BINARY_INTEGER;
TYPE NameTab IS TABLE OF CHAR(15) INDEX BY BINARY_INTEGER;
    pnums  NumTab;  pnames NameTab;
BEGIN
 FOR i IN 1..20000 LOOP  -- use FOR loop
       INSERT INTO parts VALUES (pnums(i), pnames(i));
 END LOOP;
END;                                                                                     Time taken in sec: 11.0
/

Example:
Method 1 (Most Efficient) :
DECLARE
   TYPE NumTab IS TABLE OF NUMBER(5) INDEX BY BINARY_INTEGER;
   TYPE NameTab IS TABLE OF CHAR(15) INDEX BY BINARY_INTEGER;
pnums  NumTab;
pnames NameTab;
BEGIN
FORALL I in 1 .. 20000 -- use FORALL
       INSERT INTO parts VALUES (pnums(i), pnames(i));
END;                                                                                                     Time Taken in Secs: 0.5 sec
/

Using  Bulk Collect
Example :
Method 1 (Least Efficient) :
Declare
   Type bcode is table of products.barcode%TYPE;
   i int;
   barc bcode;
  cursor cur_seq is
        select barcode from products where rownum<100001;
Begin
        i:=0;
        for cur_dta in cur_seq loop 
         barc:=cur_dta(i).barcode;
         i:=i+1;
        end loop;
End; 
                                                            Execution Time : 20 sec
Method 2 (Most Efficient) :
declare
    Type bcode is table of products.barcode%TYPE;
        i int;
        barc bcode;
begin
select barcode BULK COLLECT into barc from products where rownum<100001;
end;
                                                                                                                                                                           Time taken : 1.41 sec

19 : Use of Index
> Indexes helps a lot in increasing the Performance of a Query. Its like storing the address of a perticular Record.
> We can create Bitmap Index, functional Index, Normal Index based on requirement.
> If we create a Unique index on a column data should be unique  in the column.
> Oracle also provides a new concept of Invisible Index for performance monitoring of a specific query without affecting the other query referencing the Table

20: Using table partition
> Table partition is helpfull when a table contain a huge amount of data.
> It breaks the Table into smaller segments called partition thus resulting into faster data retrival.
> Partition can be Various Type. Range , Hash, List Partition
> For Partitioned table Query writing will be same as for the Normal Tables.
> We can also Go Partioned Index
> After performing the partition managment we need to gather the Statistics
> No Special Query Writing method for Partiton Tables..
CREATE TABLE Emp    ( Emp_no NUMBER,       Emp_sal varchar2(30) )
              PARTITION BY RANGE (Emp_no)
              ( PARTITION emp1 VALUES LESS THAN (10) TABLESPACE tbsa,
               PARTITION emp2 VALUES LESS THAN (20) TABLESPACE tbsb);

Note: Use bigger values for range partition ( in lakh)  in case of production environment.

CREATE TABLE Emp (Emp_no NUMBER,  Emp_sal varchar2(30))
               PARTITION BY HASH (Emp_no)
               PARTITIONS 4  STORE IN (gear1, gear2, gear3, gear4);
CREATE INDEX local_idx_SRC ON BILLING.TESTLALIT (PATIENTSERVICEREQUISTID)  LOCAL;
CREATE INDEX global_part_idx ON  BILLING.TESTLALIT(PATIENTSERVICEREQUISTID)
GLOBAL PARTITION BY RANGE(PATIENTSERVICEREQUISTID)
(PARTITION p1 VALUES LESS THAN(1000),
 PARTITION p4 VALUES LESS THAN(10000));
Note : MAXVALUE value can be used if further you don’t want to add new partition.

Caution: If any partition is filled up and new partition is not added, then your entire transaction will be stopped. So be careful when you are adding new partitions. But in Oracle 11g onwards there is solution to add partitions automatically.

-- add partition manually
Alter table owner.partition_tablename add PARTITION partitionname values less than (value)
e.g.,
sql> alter table hr.employee add PARTITION employee _4 values less than (1000000)

-- add partition automatically / add partition dynamically ( Oracle 11g onwards)
Sql> alter table owner.partition_tablename set interval (100000);
--e.g.,
alter table hr.employee set interval (500000);

Click Here to view more about Table partitioning

Click here to read Top 10 best practices for performance tuning


Do's
 
Ø  Use the cursor FOR loop – Whenever you need to read through every record fetched by a cursor, use the for loop instead of open cursor, fetch & close cursor steps
Ø  Work with records – Always fetch from an explicit cursor into a record declared with %ROWTYPE, as opposed to individual variables.
Ø  Write small programming blocks for all the queries/code which is re-used and call these programs in the procedures / functions instead of re-coding.
Ø  Use the %TYPE or %ROWTYPE declaration attributes to define the data type of those of variables to avoid data type mismatch. {If those database elements change, compiled code is discarded. When recompiled, the changes are automatically applied to the code}.
Ø  Use Bulk Collects, Bulk Fetch if the result for Select Stmt fetches more data
Ø  Use bind variables for comparing known values in the where clause of the queries. For eg. Where status=1 can be reframed as status=:a where a stores 1.
Ø  Coding a few simple queries in place of a single complex query is a better approach, because the individual SQL statements are easier to optimize and maintain.
Ø  Using if-else-end if conditions, write multiple queries instead of single query in which the columns are compared with null or not null values,
Ø  Use UNION ALL instead of UNION if the result set of both the queries are same
Ø  Use ‘For all’ Statements for repeated DML statements with an exception of using it cautiously for update statements
Ø  Use DECODE and CASE - Performing complex aggregations with the “decode” or "case" functions can minimize the number of times a table has to be selected

Don’ts
Ø  Avoid the LIKE predicate – Always replace a "like" with an equality, when appropriate
Ø  Avoid the use of NOT IN or HAVING. Instead, a NOT EXISTS subquery may run faster
Ø  Never mix data types - If a WHERE clause column predicate is numeric, do not to use quotes. For char index columns, always use quotes. There are mixed data type predicates:    eg. Where cust_nbr = “123”, Where substr(ssn,7,4)= 1234
Ø  Avoid IN/NOT IN and use EXISTS/NOT EXISTS
Ø  Do not use “SELECT *” under any circumstances inside application code
Ø  Do not use count(*). Replace count(*) with count(rowid)
Ø  Avoid using unnecessary brackets. Unnecessary brackets increase parsing time and adversely affect query performance
Ø  Avoid usage of != or <> (not equal to) operators, instead use other relational operators like ‘>’ and ‘<’

Sure it will help. Don't forget to post comments. I hope sure we will post.

8 comments:

  1. really a helpful one for the begineers and good points were presented in detail in blog.

    www.erptree.com/course/oracle-fusion-financials-training-in-ameerpet/

    ReplyDelete
  2. Thanks for sharing valuable and informative content. Keep it up.

    We also provide same services such as MySQL database and sql and oracle sql free download etc. if you want to take any related services please visit our official website tosska.com.

    ReplyDelete
  3. Enjoyed reading the article above, really explains everything in detail, the article is very inter sting and effective. Thank you and good luck for the upcoming articles soa Online Training

    ReplyDelete
  4. This comment has been removed by a blog administrator.

    ReplyDelete
  5. This comment has been removed by a blog administrator.

    ReplyDelete

Translate >>