Feb 16, 2016

Cloning with Cold Backup (offline/closed)

Cloning is a process used to create an exact copy of an existing Oracle database. DBAs often need to clone databases to build test or development environments, relocate a database to another machine, verify backup and recovery strategies, or recover an object (for example, export a table that was dropped from the production database and import it back). Cloning can be done on a separate host or on the same host, and it is different from a standby database. Oracle recommends against cloning a database on the same host with the same database name.

Be sure to perform this type of cloning on the same platform and with at least the same Oracle version.

From Oracle 12c onwards, you can move datafiles online if you need to relocate a datafile.
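The 12c feature mentioned above has a simple syntax; a hypothetical example (the paths are illustrative, not from this article):

```sql
-- Oracle 12c+: relocate a datafile while the database remains open.
ALTER DATABASE MOVE DATAFILE '/u02/oracle/oradata/UAT/users01.dbf'
  TO '/u03/oracle/oradata/UAT/users01.dbf';
```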

Cloning can be done in the following ways:
1. Cold backup cloning
2. Hot backup cloning
3. RMAN cloning

Cold backup (offline/closed) Cloning:

Let us see how to clone with a cold (offline) backup from host Test to host Test1.

The following steps clone the database using a cold backup and a "CREATE CONTROLFILE" script:

Source Database Name: UAT (HostName: Test)
Clone database Name: DEV (HostName: Test1)

Step1:
On the source database (UAT), host Test:
$export ORACLE_SID=UAT
SQL> select name from v$datafile;
SQL> select member from v$logfile;
SQL> select name from v$controlfile;

On the source side, create a pfile for the clone database:

SQL> create pfile='/u01/backup/dev/initdev.ora' from spfile;

Copy the created pfile (initdev.ora) to the clone database host.
On the clone side, change the value of "db_name" from "UAT" to "DEV". Keep in mind that the following parameters may also need to change:
---------------------------------------------------------------------------------------------------------
*._keep_remote_column_size=TRUE
*.audit_file_dest='/oracle/app/admin/dev/adump'
*.audit_trail='db'
*.compatible='11.2.0.4.0'
*.control_files='/u02/oracle/oradata/dev/control01.ctl','/u02/oracle/oradata/dev/control02.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='dev'
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=devXDB)'
*.log_archive_format='%t_%s_%r.dbf'
*.nls_date_format='DD-MON-RRRR HH24:MI:SS'
*.open_cursors=300
*.pga_aggregate_target=1G
*.processes=600
*.remote_login_passwordfile='EXCLUSIVE'
*.sec_case_sensitive_logon=FALSE
*.sessions=665
*.sga_target=6G
*.undo_tablespace='UNDOTBS1'

---------------------------------------------------------------------------------------------------------
Create the script that will re-create the control file:

SQL> alter database backup controlfile to trace as '/u01/control_preprd.trc';

Note:
The above statement will put a text copy of the control file in the /u01 directory.

After finding the correct trace file, rename it to dev_cr8control.sql and edit it as follows:
Remove everything up to the "STARTUP NOMOUNT" statement and everything after the semicolon at the end of the "CREATE CONTROLFILE" statement.

Edit the line starting with "CREATE CONTROLFILE" and replace the word "REUSE" with the word "SET" right before the keyword DATABASE.

On the same line, modify the database name, changing it from UAT to DEV.
On the same line, change the keyword NORESETLOGS to RESETLOGS.

If the clone (DEV) database file locations are different from the source database file locations, change the paths for the data files and redo logs in this script.

Ex:
My source database location: "/u02/oracle/oradata/UAT/"
Clone database location: "/u02/oracle/oradata/dev/"

Script should now read:
Edited file dev_cr8control.sql
---------------------------------------------------------------------------
CREATE CONTROLFILE SET DATABASE "dev" RESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 1168
LOGFILE
  GROUP 1 '/u02/oracle/oradata/dev/redo01.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 '/u02/oracle/oradata/dev/redo02.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 '/u02/oracle/oradata/dev/redo03.log'  SIZE 50M BLOCKSIZE 512
-- STANDBY LOGFILE
DATAFILE
  '/u02/oracle/oradata/dev/system01.dbf',
  '/u02/oracle/oradata/dev/sysaux01.dbf',
  '/u02/oracle/oradata/dev/undotbs01.dbf',
  '/u02/oracle/oradata/dev/users01.dbf',
  '/u02/oracle/oradata/dev/HR01.dbf',
  '/u02/oracle/oradata/dev/payroll01.dbf'
CHARACTER SET AL32UTF8
;

Note: Because we are changing the database name, "SET" is used in the CREATE CONTROLFILE statement. "REUSE" is used when you keep the same database name but relocate the control files, redo logs, or datafiles.
---------------------------------------------------------------------------------------------------------

Step2:
On source database(UAT) Host Test:

Shut down the database "UAT" and copy only the datafiles (do not copy the redo logs or control files) with the "cp" command at the operating system level. We skip the redo logs and control files because Oracle creates them automatically when the re-create controlfile script is run on the clone database (dev).

The procedure is as follows:
SQL>SHUTDOWN

$cp *.dbf  /u01/bkp/UAT

Transfer all copied datafiles to clone database Host Test1:
$scp  *.dbf  oracle@Test1:/u02/oracle/oradata/dev
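The copy-only-datafiles rule in Step 2 can be sketched as a small shell script. This is a local stand-in for the cp/scp commands above, using hypothetical /tmp paths and empty stand-in files; the point is only that *.dbf files are copied while redo logs and control files are skipped:

```shell
#!/bin/sh
# Sketch of the Step 2 copy rule (hypothetical /tmp paths, empty stand-in
# files; a local cp stands in for the scp transfer above).
# Only *.dbf datafiles are copied; redo logs and control files are skipped
# because Oracle re-creates them when the CREATE CONTROLFILE script runs.
SRC=/tmp/uat_oradata
DST=/tmp/dev_oradata
mkdir -p "$SRC" "$DST"
touch "$SRC/system01.dbf" "$SRC/users01.dbf" \
      "$SRC/redo01.log" "$SRC/control01.ctl"
cp "$SRC"/*.dbf "$DST"/
ls "$DST"
```

On a real system the datafiles would go to the remote host with scp as shown above; the sketch only illustrates the file selection.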

Step3:
On clone database (dev) Host Test1:
$export ORACLE_SID=dev
SQL> STARTUP NOMOUNT PFILE='/u01/app/oracle/db_1/dbs/initdev.ora'
Started..

Step4:
Run the re-create controlfile script on the clone database (dev):
SQL> @dev_cr8control.sql
Control file created…

Step5:
Now try to open the database "dev" with resetlog option:
SQL> alter database open resetlogs;
Database altered.
The database "dev" successfully opened.

Note: Keep in mind that we copied only the datafiles (.dbf) to the clone host Test1; the redo logs and control files were created automatically by Oracle when the "dev_cr8control.sql" re-create controlfile script was run.

Step6:
Now check the cloned database "dev":

SQL> select name from v$database;
NAME
---------
DEV
SQL> select name from v$datafile;
NAME
----------------------------------------------------------------
/u02/oracle/oradata/dev/system01.dbf
/u02/oracle/oradata/dev/undotbs01.dbf
/u02/oracle/oradata/dev/sysaux01.dbf
/u02/oracle/oradata/dev/users01.dbf
SQL> select member from v$logfile;
MEMBER
------------------------------------------------------------------
/u02/oracle/oradata/dev/redo01.log
/u02/oracle/oradata/dev/redo02.log
/u02/oracle/oradata/dev/redo03.log

SQL> select name from v$controlfile;

NAME
-------------------------------------------------------------------      
/u02/oracle/oradata/dev/control01.ctl
/u02/oracle/oradata/dev/control02.ctl


Thanks !


Possible issues:

SQL> @create_ctl.sql
CREATE CONTROLFILE SET DATABASE "dev" RESETLOGS  NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01130: database file version 11.2.0.4.0 incompatible with ORACLE version
11.2.0.0.0
ORA-01110: data file 1: '/u02/oracle/oradata/dev/system01.dbf'


SQL>


Fix:

Check the compatible parameter in your parameter file.
In my case it was set as follows:

*.compatible='11.2.0.0.0'

But my RDBMS version is 11.2.0.4, so I changed it to:

*.compatible='11.2.0.4.0'

Then I re-created the control file and it worked fine.



Feb 5, 2016

Clustering Factor - a key metric for SQL Tuning in Oracle 11g

We will discuss the following points in this article:

1) What is clustering factor?
2) How the clustering factor is being calculated?
3) How to interpret clustering factor?
4) Bad Clustering factor Vs Good Clustering factor
5) Demonstration
6) FAQ

What is clustering factor?
The clustering factor is a number that represents the degree to which the data in a table is randomly distributed. In simple terms, it is the number of "block switches" made while reading a table through an index. Put another way, the clustering factor is a measure of how well the order of an index matches the order of the table it is based on. It is used to estimate the cost of a table lookup following an index access (multiplying the clustering factor by the index's selectivity gives you the cost of the operation).

Note: A table with a high clustering factor is out of sequence with respect to its rows, and large index range scans on it will consume a lot of I/O.

How the clustering factor is being calculated?
To calculate the clustering factor of an index during the gathering of index statistics, Oracle does the following:
   - For each entry in the index, Oracle compares the entry's table rowid block with the block of the previous index entry. If the block is different, Oracle increments the clustering factor by 1.
   - The minimum possible clustering factor is equal to the number of blocks identified through the index's list of rowids; for an index on a column or columns containing no nulls, this will be equal to the number of blocks in the table that contain data. The maximum clustering factor is the number of entries in the index.
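The two rules above can be sketched in a few lines of Python. This is an illustrative model, not Oracle's code: we assume the index entries are already available in key order, each tagged with the table block its rowid points to:

```python
# Illustrative model of clustering-factor calculation (not Oracle's code).
# Assumption: we have, in index key order, the table block that each
# index entry's rowid points to.
def clustering_factor(block_ids):
    """Count table-block switches while walking the index in key order."""
    cf = 0
    prev = None
    for blk in block_ids:
        if blk != prev:      # different table block than the previous entry
            cf += 1
        prev = blk
    return cf

# Well clustered: index order visits each block's rows together.
print(clustering_factor([1, 1, 1, 2, 2, 2]))   # -> 2 (number of blocks)
# Poorly clustered: index order bounces between blocks.
print(clustering_factor([1, 2, 1, 2, 1, 2]))   # -> 6 (number of entries)
```

The first call hits the minimum (one increment per distinct block run); the second hits the maximum (one increment per index entry).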

How to interpret clustering factor? 
So this means that Oracle now has a statistic to allow it to estimate how many table blocks would be associated with an index range scan.

If the clustering factor is close to the number of entries in the index, then an index range scan of 1000 index entries may require nearly 1000 blocks to be read from the table.

If the clustering factor is close to the number of blocks in the table, then an index range scan of 1000 index entries may require reading far fewer table blocks -- perhaps only 50.

This can be compared with the cost of reading the entire table up to the high-water mark (using the more efficient multiblock i/o mechanism) to determine whether a full table scan or an index range scan offers the most efficient access mechanism.

Note that where extensive deletes have occurred from the table there may be blocks with no rows in. These will be accounted for in the clustering factor because those blocks will not appear in the index's rowid list. The full table scan will still read all table blocks up to the high water mark, regardless of whether they contain rows or not. So in an extreme case it is possible that Oracle could see from the index and table statistics that although a table has 1,000,000 blocks below the high water mark, reading 100% of the rows in the table might only require reading 10 of those blocks. Providing that "Not Null" constraints tell Oracle that all table rows are present in the index, a query such as "select * from big_table_with_few_rows" might be more efficiently satisfied with an index range scan than with a full table scan.

Note also that calculating the clustering factor is done without reference to the table at all -- it is based solely on information contained in the index.


Bad Clustering factor Vs Good Clustering factor

Fig-1 : Bad CF

The above diagram shows how scattered the rows of the table are. The first index entry (from the left of the index) points to the first data block and the second index entry points to the second data block. So during an index range scan or full index scan, the optimizer has to switch between blocks and may revisit the same block more than once because the rows are scattered. The number of times the optimizer makes these switches is what is termed the "clustering factor".

Fig-2 : Good CF

The above image represents a "good CF". During an index range scan, the optimizer rarely has to jump to the next data block, because most of the index entries point to the same data block. This helps significantly in reducing the cost of your SELECT statements.

The clustering factor is stored in the data dictionary and can be viewed from dba_indexes (or user_indexes).

See also: "Selectivity, Clustering, and Histograms" (Burleson Consulting).

Demonstration:

SQL> create table CF_TEST as select * from all_objects;
Table created.

SQL> create index obj_id_indx on CF_TEST(object_id);
Index created.

SQL> select clustering_factor from user_indexes where index_name='OBJ_ID_INDX';

CLUSTERING_FACTOR
-----------------
             2165
SQL> 
SQL> select count(*) from CF_TEST;

  COUNT(*)
----------
    103090

SQL> select blocks from user_segments where segment_name='OBJ_ID_INDX';

    BLOCKS
----------
       256

The above example shows that Oracle would have to switch table blocks 2165 times to return the full data set if it read the entire table through this index.

Note:
- A good CF is equal (or close) to the number of blocks in the table.
- A bad CF is equal (or close) to the number of rows in the table.
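The good-versus-bad contrast in the note above can be simulated. The sketch below is a toy model (row counts, block size, and names are invented): rows are stored ten to a block in insertion order, and the "index" then scans the keys in sorted order, counting block switches as described earlier.

```python
import random

# Toy simulation of good vs bad clustering factor (all numbers invented).
# Rows are stored 10 per block in insertion order; the "index" then scans
# the keys in sorted order and counts table-block switches.
def simulated_cf(keys_in_insertion_order, rows_per_block=10):
    block_of = {key: pos // rows_per_block
                for pos, key in enumerate(keys_in_insertion_order)}
    cf, prev = 0, None
    for key in sorted(block_of):       # index range scan in key order
        if block_of[key] != prev:
            cf += 1
        prev = block_of[key]
    return cf

keys = list(range(100))                # 100 rows -> 10 table blocks
good = simulated_cf(keys)              # rows inserted in key order
random.seed(1)
shuffled = keys[:]
random.shuffle(shuffled)
bad = simulated_cf(shuffled)           # rows inserted in random order
print(good, bad)   # good equals the block count (10); bad is near the row count
```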

Myth:
Rebuilding an index can improve the CF. (It cannot: the clustering factor depends on the physical order of the rows in the table, not on the structure of the index, so an index rebuild leaves it unchanged.)

Tip:
The clustering of data within the table can be used to improve the performance of statements that perform range scan–type operations. By determining how the column is being used in the statements, indexing these column(s) may provide a great benefit.

The clustering factor records the number of blocks that will be read when scanning the index. If the index being used has a large clustering factor, then more table data blocks have to be visited to get the rows in each index block (because adjacent rows are in different blocks). If the clustering factor is close to the number of blocks in the table, then the index is well ordered, but if the clustering factor is close to the number of rows in the table, then the index is not well ordered. The clustering factor is computed by the following (explained briefly):

  • The index is scanned in order.
  • The block portion of the ROWID pointed at by the current indexed value is compared to that of the previous indexed value (comparing adjacent rows in the index).
  • If the ROWIDs point to different TABLE blocks, the clustering factor is incremented (this is done for the entire index).

The CLUSTERING_FACTOR column in the USER_INDEXES view gives an indication as to how organized the data is compared to the indexed columns. If the value of the CLUSTERING_FACTOR column value is close to the number of leaf blocks in the index, the data is well ordered in the table. If the value is not close to the number of leaf blocks in the index, then the data in the table is not well ordered. The leaf blocks of an index store the indexed values as well as the ROWIDs to which they point.

For example, say the CUSTOMER_ID for the CUSTOMERS table is generated from a sequence generator, and the CUSTOMER_ID is the primary key on the table. The index on CUSTOMER_ID would have a clustering factor very close to the number of leaf blocks (well ordered). As the customers are added to the database, they are stored sequentially in the table in the same way the sequence numbers are issued from the sequence generator (well ordered). An index on the CUSTOMER_NAME column would have a very high clustering factor, however, because the arrangement of the customer names is random throughout the table.

The clustering factor can impact SQL statements that perform range scans. With a low clustering factor (relative to the number of leaf blocks), the number of blocks needed to satisfy the query is reduced. This increases the possibility that the data blocks are already in memory. A high clustering factor relative to the number of leaf blocks may increase the number of data blocks required to satisfy a range query based on the indexed column.

Note: Using the "alter table xxx shrink space" command will change the clustering_factor for the primary index on the table.  This, in turn, could cause dynamic statistics to generate different SQL execution plans.  Hence, you should rebuild or coalesce your indexes whenever you issue the alter table shrink space command.  This will ensure that no SQL changes plans.

FAQ:
1) The clustering_factor column in the user_indexes view is a measure of how organized the data is compared to the indexed column. Is there any way I can improve the clustering factor of an index, and if so, how?

Ans:
We already discussed this above; to elaborate:

It tells us how ordered the table's rows are with respect to the index. If CLUSTERING_FACTOR approaches the number of blocks in the table, the rows are well ordered. If it approaches the number of rows in the table, the rows are randomly ordered. In such a case (clustering factor near the number of rows), it is unlikely that index entries in the same leaf block will point to rows in the same data blocks.

Note that typically only 1 index per table will be heavily clustered (if any).  It would be extremely unlikely for 2 indexes to be very clustered. If you want an index to be very clustered -- consider using index organized tables.  They force the rows into a specific physical location based on their index entry.
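As a hypothetical sketch of such an index-organized table (table and column names are invented for illustration):

```sql
-- Rows are stored in the primary-key index itself, so they are
-- physically kept in customer_id order.
CREATE TABLE customers_iot (
  customer_id   NUMBER PRIMARY KEY,
  customer_name VARCHAR2(100)
) ORGANIZATION INDEX;
```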

Otherwise, a rebuild of the table is the only way to get it clustered (but you really don't want to get into that habit for what will typically be of marginal overall improvement).

2) What is well order of Index as part of clustering factor?
Ans:
In general, if all of the index entries in a given leaf block point to the same block, then the table is well ordered with respect to this index.

If all of the index entries in a given leaf block point to different blocks in the table, then the table is not well ordered with respect to this index.

