Dec 2, 2015

Oracle PL/SQL programming common mistakes

Points covered:

1) Formatting data in views
2) Hardcoding local Varchar2 variable size
3) Ignoring exceptions
4) Not using bound variables for changing parameters
5) Storing empty LOBs
6) Too many levels of views
7) Transactional control in non-autonomous procedures
8) Using Sequence currval without nextval
9) Using bound variables for constants
10) Using derived column values for existence checks
11) Using non-dedicated packages for continuous jobs
12) Wrapping everything into stored procedures
13) Use of Truncate for Global Temporary Table (GTT)
14) Using non-deterministic functions directly in conditions
15) Catch-all error handling

I have observed many common mistakes in PL/SQL code written for various environments. When code is written for OLTP applications, these kinds of mistakes must be avoided. Here are my descriptions of all the points above, with root cause analysis and solutions.

1) Formatting data in views

Severity: Makes system harder to use

Symptoms:
• to_char used on date or numeric values in views
• strings concatenated in view code to form pretty printed values

Why is this bad?
• data format cannot be easily modified on the front-end, since some information may be lost
• values cannot be easily modified (i.e. applying time zone shifting becomes much harder)
• filtering based on underlying date or string values becomes much more processor-heavy and requires full table scans and/or substring matching/comparisons.
• internationalisation becomes much harder — instead of translating elements and then combining them, translation engines must analyse and translate/reformat formatted data

Solutions:
• Format data on the front-end, not in the database.
• Perform formatting in queries coming from the front end, specifying exactly what the front end needs — but database views should not assume any specific data format (see the sketch below).
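
A minimal sketch of the idea; the trade table and its columns are hypothetical:

-- Bad: the view bakes a display format into the data
CREATE OR REPLACE VIEW trade_v AS
SELECT trade_id, to_char(trade_date, 'DD-MON-YYYY') AS trade_date
FROM trade;

-- Better: the view exposes the raw value; the front-end query formats it
CREATE OR REPLACE VIEW trade_v AS
SELECT trade_id, trade_date
FROM trade;

SELECT trade_id, to_char(trade_date, 'DD-MON-YYYY') AS trade_date_display
FROM trade_v;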


2) Hardcoding local Varchar2 variable size

Severity: Makes system harder to use

Symptoms
• PL/SQL function or procedure declares local Varchar2 variables for temporary storage of table values, with hard-coded length
• Views declared with Varchar2 types with hard-coded length

Why is this bad?
• Code is error prone, because hard-coded values may not allow for enough space to store the entire value coming from a database table.
• Even if the size is correct, if the underlying type ever changes, errors such as ORA-06502 ‘Character string buffer too small’ may start appearing in procedures.

Solution:

Use %TYPE to reference the underlying column type instead of hard-coding the type and size for local variables, as in the sketch below.
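
A minimal sketch; the employee table and its columns are hypothetical:

DECLARE
  -- hard-coded size breaks if employee.last_name is ever widened:
  -- v_name VARCHAR2(30);
  -- anchored declaration always matches the table definition:
  v_name employee.last_name%TYPE;
BEGIN
  SELECT last_name INTO v_name FROM employee WHERE emp_id = 100;
END;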

Exceptions
• variables and fields not related to underlying table data
• fields or variables that combine several table fields

3) Ignoring exceptions

Severity: Risk of data corruption

Symptom

This is a typical example:

BEGIN
  ...
EXCEPTION
  WHEN OTHERS THEN
    NULL;
END;

This kind of code is written when errors such as attempts to insert a duplicate record or to modify a nonexistent row should not affect the transaction. It is also common in triggers that must be allowed to fail without affecting the operation which caused them (best-effort synchronisation with an external resource). Less frequently, this code is written by junior developers who do not know what to do in case of an error, so they simply disregard exceptions.

Why is this bad?

Serious errors such as storage problems or table mutations might be hidden from the calling code

Solutions
• If you do not want any errors to affect the current transaction, execute the code in an autonomous transaction and log errors to an error/log table, as sketched below. For critical functions, implement some sort of administrative notification for those errors. For low-priority functions, check the log table periodically for errors.
• If you want to ignore certain exceptions because they can be solved by re-processing, handle only those specific exceptions.
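
A minimal sketch of this pattern; the error_log table, log_error procedure and sync_target table are hypothetical names:

CREATE OR REPLACE PROCEDURE log_error(p_module VARCHAR2, p_message VARCHAR2) IS
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  INSERT INTO error_log (logged_at, module, message)
  VALUES (SYSTIMESTAMP, p_module, p_message);
  COMMIT;  -- commits only the log row, never the caller's transaction
END;
/

BEGIN
  INSERT INTO sync_target (id, payload) VALUES (101, 'abc');
EXCEPTION
  WHEN DUP_VAL_ON_INDEX THEN          -- the one expected, ignorable error
    log_error('sync_target', SQLERRM);
END;
/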

Exception

low-risk functions where any errors can safely be ignored

4) Not using bound variables for changing parameters

Severity: Reduced performance

Symptom
Frequently executing the same query with different parameter values, but specifying parameter values literally, without using bound variables.

Why is this bad?
• The database engine will have to hard-parse the query every time and will not be able to reuse the cached statement.
• If care is not taken to prevent SQL injection, this may open a security hole in the system.

Solution
For all parameters that are genuinely changing, use a bound variable instead of specifying the value literally. For example:
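
A minimal sketch using dynamic SQL inside a procedure that receives a hypothetical p_status parameter; the orders table is also hypothetical:

-- Bad: a new literal on every call forces a hard parse each time
EXECUTE IMMEDIATE 'DELETE FROM orders WHERE status = ''' || p_status || '''';

-- Better: the statement text is constant, so it is parsed once and cached
EXECUTE IMMEDIATE 'DELETE FROM orders WHERE status = :st' USING p_status;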

Exceptions
• Ad-hoc queries that are run only once or infrequently
• Parameters where statement caching is pointless for different values

5) Storing empty LOBs

Severity: Reduced performance

Symptom
Empty CLOB values used instead of NULL for CLOB fields that do not hold a value

Why is this bad?
Oracle allocates space for empty CLOBs. In tables with a large number of rows holding empty CLOB values, this can take up significant storage space.

Solution
Use NULL instead of EMPTY_CLOB(), for example:
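
A minimal sketch; the document table is hypothetical:

-- Bad: allocates LOB storage even though there is no content
INSERT INTO document (id, body) VALUES (1, EMPTY_CLOB());

-- Better: no LOB storage is allocated until there is actual content
INSERT INTO document (id, body) VALUES (1, NULL);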

6) Too many levels of views

Severity: Reduced performance

Symptom
A large hierarchy of views containing sub-views or subqueries is in place. Such a hierarchy is usually established as several layers of abstraction over abstraction, typically when new core features are added to the underlying models while the client API is kept backward compatible through a set of higher-level views.

Why is this bad?
After typically 8 or 9 levels of nesting, the optimiser will give up and run full table scans even where indexes could be used.

Solutions
• Flatten the structure so that it has fewer than 8 levels. Use joins instead of subqueries where possible
• Use materialised views to cut off a part of the hierarchy (see the sketch below)
• If materialised views cannot be used for performance reasons, use an aggregated table maintained by triggers to do the same.
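
A minimal sketch of the materialised-view approach; the orders and order_lines tables are hypothetical:

CREATE MATERIALIZED VIEW order_totals_mv
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND AS
SELECT o.order_id, o.customer_id, SUM(l.amount) AS total
FROM orders o JOIN order_lines l ON l.order_id = o.order_id
GROUP BY o.order_id, o.customer_id;

Higher-level views can then select from order_totals_mv instead of re-nesting the original sub-views.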

7) Transactional control in non-autonomous procedures

Severity: Risk of data corruption

Symptom
Commit or rollback statement in a stored procedure or function without PRAGMA AUTONOMOUS_TRANSACTION

Why is this bad?
Effectively prevents stored procedures from being used in a wider context — rolling back or committing inside a stored procedure will delete/permanently write data that was used in a wider transaction, in the middle of that transaction.

May cause issues that are very hard to trace/debug: you will not be able to check whether data was processed correctly when the procedure rolled back, and audit logs will contain references to records which, from a logical point of view, never existed.

May cause inconsistent data — since rolling back/committing will split a wider logical transaction into two (one which rolled back and another which is still running), relational constraints might fail in the second transaction. Even worse, if the relational constraint checks are not enforced, inconsistent data might be written permanently.

Solutions
• Throw exceptions in case of errors and let the caller decide what to do. Do nothing in case the operation succeeded; let the caller decide whether the entire wider transaction is correct or not (see the sketch below).
• Add PRAGMA AUTONOMOUS_TRANSACTION; to the procedure declaration to make it run as an autonomous transaction
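
A minimal sketch of the first solution; the accounts table and the procedure are hypothetical:

CREATE OR REPLACE PROCEDURE transfer_amount(
  p_from   NUMBER,
  p_to     NUMBER,
  p_amount NUMBER) IS
BEGIN
  UPDATE accounts SET balance = balance - p_amount WHERE id = p_from;
  IF SQL%ROWCOUNT = 0 THEN
    -- raise and let the caller decide whether to roll back the wider transaction
    RAISE_APPLICATION_ERROR(-20001, 'Source account not found');
  END IF;
  UPDATE accounts SET balance = balance + p_amount WHERE id = p_to;
  -- no COMMIT here: the caller completes or abandons the wider transaction
END;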

Exceptions
• Long-running worker procedures such as batch updates (may include suboperations and savepoints to store partial results; should be marked as an autonomous transaction).
• Auditing (should be done with autonomous transactions)

8) Using Sequence currval without nextval

Severity: Risk of data corruption

Symptoms
• Sequence currval method used in a procedure or trigger without calling nextval first — typically in a trigger that updates a log record, or a procedure that partially processes data
• sequence currval used to calculate the next value which will be read by nextval

Why is this bad?
• Calling currval will cause an exception if the sequence has not been initialised in the session — so the procedure/trigger depends on the caller to call nextval first
• Procedures relying on someone else to initialise the sequence must be called in a specific context, which limits their reuse
• Triggers may or may not work depending on the order of execution
• If procedure/trigger uses the current sequence value to update the relevant record, calling it in a different context/order of execution may update the wrong record
• Sequences are not transactional — they can be cached or changed in another session between calls to currval and nextval. currval+1 is not guaranteed to be the next value; using that to predict IDs is very dangerous, as it can lead to wrong records being deleted or updated.

Solutions
• Do not use currval to read contextual information. Pass IDs explicitly to procedures or use sys_context to store contextual information (see the sketch below)
• Use nextval instead of currval where appropriate (if you just need a unique number)
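
A minimal sketch; log_entry, entries_seq and process_entry are hypothetical names:

DECLARE
  v_id NUMBER;
BEGIN
  v_id := entries_seq.NEXTVAL;   -- initialise and capture the value in one step
                                 -- (on pre-11g, use SELECT ... INTO FROM dual)
  INSERT INTO log_entry (id, msg) VALUES (v_id, 'created');
  process_entry(v_id);           -- pass the ID explicitly; do not rely on currval
END;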

9) Using bound variables for constants

Severity: Reduced performance

Symptoms
• Bound variables used in queries for values that never change (often when client developers bind all values in a query).
• Bound variables used for parameters where the actual value can significantly affect the optimisation plan

Why is this bad?
• The optimiser will not use the most efficient plan to execute the query
• If bind variable peeking is turned on, it might actually choose a completely wrong execution plan for subsequent runs

Solution
For values that are not changing in a query, or where statement caching does not make sense, use a literal and not a bound variable (this does not imply that other genuine parameters in the same query should not be bound). For example:
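
A minimal sketch, assuming a hypothetical orders table where the status value is a fixed constant and the customer id genuinely varies:

SELECT order_id, amount
FROM orders
WHERE status = 'ARCHIVED'      -- constant: a literal lets the optimiser use
                               -- column statistics for this specific value
  AND customer_id = :cust_id;  -- genuine parameter: keep it bound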

10) Using derived column values for existence checks

Severity: Reduced performance

Symptom
count, sum or some other aggregate function (e.g. a custom-made CSV aggregate) is executed in a subquery and then compared to 0, 1 or NULL in the WHERE clause of the outer query, in order to check whether at least one (or exactly one) related element in the substructure exists; the results of the aggregate function are then discarded (not passed on through the outer view).

The subquery was typically developed first, possibly as a full-fledged view, and the outer query/view, which filters the results of the first view, was written later. Less frequently, both are in the same view, the first one used as a subquery.

CREATE OR REPLACE VIEW instrument_v AS
SELECT instr.idinstr, instr.name, count(opt.type) AS activeoptiontypes
FROM instrument instr, instrumentoption opt
WHERE instr.idinstr = opt.idinstr (+)
GROUP BY instr.idinstr, instr.name;
...
SELECT idinstr, name FROM instrument_v WHERE activeoptiontypes > 0;

Why is this bad?

Instead of just checking indexes, potentially a large number of records will have to be read and analysed.

Solution
Instead of comparing the results of an aggregate function, create a different view that checks directly for the existence of the related type — the optimiser will be able to execute the plan using indexes and will not have to calculate aggregate functions for irrelevant rows. For example:
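
A minimal sketch of the existence-check rewrite of the view above:

CREATE OR REPLACE VIEW instrument_with_options_v AS
SELECT instr.idinstr, instr.name
FROM instrument instr
WHERE EXISTS (SELECT 1
              FROM instrumentoption opt
              WHERE opt.idinstr = instr.idinstr);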


11) Using non-dedicated packages for continuous jobs

Severity: Makes system harder to use

Symptoms
• A continuous or frequent database job executes a long-running procedure from a pl/sql package. That same package is also used for client access or by other schemas.
• The body for a continuous or frequent database job is a procedure from such a package.

Why is this bad?
The package will be locked while the procedure is running. With a continuous or frequently running job, even the smallest change, such as granting execute on the package to another user/schema, may require interrupting the job or cause down-time.

Solution
Extract procedures used as job bodies into dedicated PL/SQL packages or into standalone procedures, for example:
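
A minimal sketch using DBMS_SCHEDULER; the package, procedure and job names are hypothetical:

-- Dedicated package containing only job bodies
CREATE OR REPLACE PACKAGE job_bodies AS
  PROCEDURE nightly_rollup;
END job_bodies;
/
CREATE OR REPLACE PACKAGE BODY job_bodies AS
  PROCEDURE nightly_rollup IS
  BEGIN
    NULL;  -- actual rollup logic goes here
  END;
END job_bodies;
/

BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'NIGHTLY_ROLLUP_JOB',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'JOB_BODIES.NIGHTLY_ROLLUP',
    repeat_interval => 'FREQ=DAILY',
    enabled         => TRUE);
END;
/

Client-facing packages stay free of job bodies, so they can be modified or re-granted without touching the running job.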

12) Wrapping everything into stored procedures

Severity: Makes system harder to use

Symptoms

• Client access layer consists exclusively or mostly of stored procedures, both for writing and for reading, with Ref Cursor output parameters used to read data.

This is typical for SQL Server programmers moving to Oracle, due to earlier SQL Server limitations in terms of access privileges and functionality. It is also often done under the false pretext that using stored procedures is more secure or brings better performance. In fact, the same security restrictions can be applied to views as to stored procedures in Oracle, and if bound variables and statement caching are used, query access to views also brings pre-compilation benefits, so there is no performance gain.

• Stored procedures are used to fill in additional details when inserting or updating data (automatically calculated columns, timestamps and similar).

Why is this bad?

• Procedural access limits operations to a fixed choice of parameters (e.g. deleting records with a specific name). You end up providing a stored procedure for every possible combination of parameters, or providing “generic” stored procedures that are no better than allowing direct table access.

• Procedural access is typically done at row level (procedures work with a single row), which carries significant performance penalties over tabular access for operations that work on multiple rows.
• Output Ref Cursor parameters do not allow clients to apply further filters or conditions
• Output Ref Cursors cannot be easily joined with additional data
• Output Ref Cursor data cannot be easily paged on demand
• Packaging data-manipulation steps in a stored procedure does not prevent someone from using the table differently and inserting or modifying data directly. Your rules will only be applied if the clients are forced to use your stored procedures. As the code base grows, and different programmers join and leave the team, this will be harder and harder to enforce.

Solutions

• Use views for reading data, not stored procedures; instead of passing filter parameters to stored procedures, expose those values as view columns and allow clients to filter using Where. Apply appropriate security restrictions to views.
• Use views for inserting and updating rows if the operation requires dynamic parameters.
• If not all required columns in a view are updatable, consider using instead-of triggers to provide the functionality (see the sketch after this list)
• Use triggers to perform actions such as populating missing columns on data modification on tables. This ensures that the data will be populated regardless of how the table is used.
• Use stored procedures to encapsulate business rules and procedural processing logic. If the front end requires data that is calculated from database column values, rather than stored in them, encapsulate that logic in functions and include the functions in views. Procedures/functions should also be used to extract common procedural steps from triggers and to simplify triggers and jobs.
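
A minimal sketch of the instead-of trigger approach; the customer_v view, customer table and customer_seq sequence are hypothetical:

CREATE OR REPLACE TRIGGER customer_v_ins
INSTEAD OF INSERT ON customer_v
FOR EACH ROW
BEGIN
  -- fill in the generated key and timestamp the client does not supply
  INSERT INTO customer (id, name, created_at)
  VALUES (customer_seq.NEXTVAL, :NEW.name, SYSTIMESTAMP);
END;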

Exceptions

• Output REF CURSOR objects should be used when filtering view columns for read-only access severely affects performance: for example, when subqueries contain aggregate functions, filtering the outer view would cause the query to be executed on all data and the results filtered afterwards. In this case, inserting specific values into the subquery can significantly reduce execution time.
• This technique can also be used to provide a uniform API for related reports, where the query depends on parameter values or on which parameters are supplied

13) Use of Truncate for Global Temporary Table (GTT)

Severity: Reduced performance

Symptom

The database is sometimes very slow or appears to hang.

Why is this bad?

As a bad practice, developers used to write a DELETE statement without any WHERE clause to empty a table. This performs poorly and has many other side effects. To overcome this, developers started writing TRUNCATE instead. That is a better choice, but heavy use of TRUNCATE in an application also has a bad impact on performance. Depending on the scope of the data, it is better to define the table as a GTT and use that. Yet developers still tend to write TRUNCATE for GTTs, even though a GTT's rows are removed automatically at the end of the transaction or session (depending on its ON COMMIT clause).

Solution

When you do write TRUNCATE for a regular table, use the following syntax:

TRUNCATE TABLE [schema_name.]table_name
  [ PRESERVE MATERIALIZED VIEW LOG | PURGE MATERIALIZED VIEW LOG ]
  [ DROP STORAGE | REUSE STORAGE ] ;

e.g., TRUNCATE TABLE scott.emp REUSE STORAGE;


When you are using a GTT, do not use TRUNCATE; define the table so that its rows are cleaned up automatically, as in the sketch below.
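
A minimal sketch; the table name is hypothetical:

CREATE GLOBAL TEMPORARY TABLE staging_rows (
  id   NUMBER,
  data VARCHAR2(100)
) ON COMMIT DELETE ROWS;  -- rows disappear automatically at commit

-- Use ON COMMIT PRESERVE ROWS instead if the data should live
-- until the session ends.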


Note:

DROP STORAGE: deallocates all space from the deleted rows, except the space allocated by the MINEXTENTS parameter of the table or cluster. This space can subsequently be used by other objects in the tablespace. Oracle Database also sets the NEXT storage parameter to the size of the last extent removed from the segment in the truncation process. This is the default.

REUSE STORAGE: retains the space from the deleted rows allocated to the table or cluster. Storage values are not reset to the values when the table or cluster was created; this space can subsequently be used only by new data in the table or cluster resulting from insert or update operations. This clause leaves storage parameters at their current settings. If you have specified more than one free list for the object you are truncating, the REUSE STORAGE clause also removes any mapping of free lists to instances and resets the high-water mark to the beginning of the first extent.

14) Using non-deterministic functions directly in conditions

Severity: Reduced performance

Symptom
A non-deterministic function that does not depend on current row values (or depends on only a subset of row values) is used in a Where clause. Examples might be functions that fetch data from a referential table depending on the current database context, or that perform some calculation on derived data. The functions are not enclosed in a select from dual or a subquery.

Why is this bad?
The function may be executed many more times than required. It might be sufficient to execute the function just once per query, but Oracle might execute it for each row of the result (or, even worse, for each row of the source).

Solutions
• Turn the function call into (select function() from dual) if it does not depend on any row values (see the sketch below)
• Move the function into a similar subquery if it depends on referential data, and join the subquery in the main query
• Mark the function as deterministic if it is by nature deterministic (i.e. always returns the same result for the same parameter values)
• Use Oracle 11g result caching if possible
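
A minimal sketch, assuming a hypothetical current_region() function that does not depend on row values, and a hypothetical orders table:

-- Bad: current_region() may be called once per row examined
SELECT * FROM orders WHERE region = current_region();

-- Better: the scalar subquery is evaluated once and its result reused
SELECT * FROM orders WHERE region = (SELECT current_region() FROM dual);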

15) Catch-all error handling

Severity: Risk of data corruption

Symptom
A catch-all exception block (WHEN OTHERS THEN) is used to process an error (or a group of errors) in a PL/SQL procedure. This is typically done to ignore an expected error which should not affect a transaction (for example, when a duplicate key in an index should be disregarded, or when a problem with an insert should be quietly logged). Catch-all error handling might also be written when a single block of code is expected to throw several types of exceptions (no data found, duplicate value on index, validation…) and the developer wants to handle them all at once.

Another example is using a WHEN OTHERS block to catch domain exceptions thrown by RAISE_APPLICATION_ERROR, because they cannot otherwise be individually checked.

Why is this bad?
• Catch-all block will prevent other unexpected errors from propagating. For example, a mutating table error caused by a trigger may be hidden with a catch-all block, completing the transaction successfully when it did not really do everything it needed.
• There is often an assumption about the error that can occur, which may be incorrect when a different type of exception is thrown. This may lead to inconsistent data. For example, if a catch-all block is used to insert a record in case of a missing data error, then that record may be inserted on any other error as well.

Solutions
• Do not use WHEN OTHERS to check for a particular exception such as NO_DATA_FOUND. Check directly for the particular type of exception (see the sketch after this list).
• To handle domain-specific errors, declare your domain exceptions so that you can check for a particular exception later in the code.
• Do not assume that you know all errors that may occur at a given place in the code. Storage problems, mutating tables and similar issues should surface to the client, and not be handled and discarded in an unrelated piece of the code.
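
A minimal sketch; process_row, insert_missing_row and queue_for_retry are hypothetical procedures, and -20042 is a hypothetical domain error code:

DECLARE
  e_stale_data EXCEPTION;
  PRAGMA EXCEPTION_INIT(e_stale_data, -20042);  -- name the domain exception
  v_id NUMBER := 42;
BEGIN
  process_row(v_id);
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    insert_missing_row(v_id);   -- the one expected, recoverable case
  WHEN e_stale_data THEN
    queue_for_retry(v_id);      -- named domain exception, checked individually
  -- no WHEN OTHERS: storage problems, mutating tables and similar
  -- unexpected errors propagate to the caller
END;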

Exception
Shielding parts of the system from errors in other parts

To be continued in the next post...
