Jan 7, 2016

Backup and Restore Oracle Cluster Registry in Oracle 11gR2

Voting Disk and OCR in 11gR2: 
In Oracle Database 11gR2 RAC, the handling of the two important Clusterware components, the Voting Disk and the Oracle Cluster Registry (OCR), has changed: both can now be stored inside an Automatic Storage Management (ASM) Disk Group, which was not possible in 10g.

The OCR is striped and mirrored (if the Disk Group has a redundancy other than external), similar to ordinary Database Files. So we can leverage the mirroring capabilities of ASM to mirror the OCR as well, without having to dedicate multiple raw devices to that purpose alone. The Voting Disk (or Voting File, as it is now also referred to) is not striped but placed as a whole on ASM Disks: if the Diskgroup uses normal redundancy, 3 Voting Files are placed, each on one ASM Disk in a different failgroup. Therefore, that Diskgroup should preferably have at least 3 failgroups.
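To see where the OCR and the Voting Files currently reside in your cluster, you can check directly from the Grid home (both commands are also shown with their output further down in this post):

$ ocrcheck
$ crsctl query css votedisk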

Backing Up Oracle Cluster Registry:

This section describes how to back up OCR content and use it for recovery. The first method uses automatically generated OCR copies and the second method enables you to issue a backup command manually:

Automatic backups: Oracle Clusterware automatically creates OCR backups every four hours. At any one time, Oracle Database always retains the last three backup copies of OCR. The CRSD process that creates the backups also creates and retains an OCR backup for each full day and at the end of each week. You cannot customize the backup frequencies or the number of files that Oracle Database retains.

Manual backups: Run the ocrconfig -manualbackup command on a node where the Oracle Clusterware stack is up and running to force Oracle Clusterware to perform a backup of OCR at any time, rather than wait for the automatic backup. You must run the command as a user with administrative privileges. The -manualbackup option is especially useful when you want to obtain a binary backup on demand, such as before you make changes to OCR. The OLR only supports manual backups.

When the clusterware stack is down on all nodes in the cluster, the backups that are listed by the ocrconfig -showbackup command may differ from node to node.

Note:
After you install or upgrade Oracle Clusterware on a node, or add a node to the cluster, when the root.sh script finishes, it backs up OLR.
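Since the OLR only supports manual backups, you can list and take them yourself with the -local option of ocrconfig; a quick sketch, run as root from the Grid home:

# ocrconfig -local -showbackup
# ocrconfig -local -manualbackup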
Listing Backup Files

Run the following command to list the backup files:
$ ocrconfig -showbackup

Example:

$ cd /u01/app/11.2.0/grid/bin
$ ocrconfig -showbackup

racdb01     2016/01/07 08:42:51     /u01/app/11.2.0/grid/cdata/dbprdscan/backup00.ocr
racdb01     2016/01/07 04:42:50     /u01/app/11.2.0/grid/cdata/dbprdscan/backup01.ocr
racdb01     2016/01/07 00:42:50     /u01/app/11.2.0/grid/cdata/dbprdscan/backup02.ocr
racdb01     2016/01/06 08:42:46     /u01/app/11.2.0/grid/cdata/dbprdscan/day.ocr
racdb01     2015/12/26 04:33:23     /u01/app/11.2.0/grid/cdata/dbprdscan/week.ocr

PROT-25: Manual backups for the Oracle Cluster Registry are not available
$

The ocrconfig -showbackup command displays the backup location, timestamp, and the originating node name of the backup files that Oracle Clusterware creates. By default, the -showbackup option displays information for both automatic and manual backups but you can include the auto or manual flag to display only the automatic backup information or only the manual backup information, respectively.
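For example, to list only one type of backup:

$ ocrconfig -showbackup auto
$ ocrconfig -showbackup manual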

Run the following command to inspect the contents and verify the integrity of the backup file:
ocrdump -backupfile backup_file_name

Example:
# cd /u01/app/11.2.0/grid/cdata/dbprdscan
# ocrdump -backupfile backup00.ocr
PROT-303: Dump file already exists [OCRDUMPFILE]
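The PROT-303 message above only means that a dump file named OCRDUMPFILE already exists in the current directory from an earlier run. You can direct the dump to a different file name, or to standard output, instead (mydump.txt below is just an example name):

# ocrdump mydump.txt -backupfile backup00.ocr
# ocrdump -stdout -backupfile backup00.ocr | more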

You can use any backup software to copy the automatically generated backup files at least once daily to a different device from where the primary OCR resides.

The default location for generating backups on Linux or UNIX systems is Grid_home/cdata/cluster_name, where cluster_name is the name of your cluster. 

Oracle recommends that you include the backup file created with the OCRCONFIG utility as part of your operating system backup using standard operating system or third-party tools.
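A simple way to do that is a daily copy of the backup directory to another device or node, for example from cron; a minimal sketch, where /backup/ocr is a hypothetical target directory on a different device:

# cp -p /u01/app/11.2.0/grid/cdata/dbprdscan/*.ocr /backup/ocr/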

To change the backup location:

You can use the ocrconfig -backuploc option to change the location where OCR creates backups. 
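For example (the directory below is just an example and should ideally be a location that is shared by and accessible from all nodes):

# ocrconfig -backuploc /u01/app/11.2.0/grid/cdata/shared_backup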

Note:
On Linux and UNIX systems, you must be the root user to run most, but not all, of the ocrconfig command options.

Restoring Oracle Cluster Registry:

If a resource fails, then before attempting to restore OCR, restart the resource. As a definitive verification that OCR failed, run ocrcheck and if the command returns a failure message, then both the primary OCR and the OCR mirror have failed. Attempt to correct the problem using the OCR restoration procedure for your platform.

Notes:
You cannot restore your configuration from an OCR backup file using the -import option, which is explained in the Oracle documentation. You must instead use the -restore option, as described in the following sections.

If you store OCR on an Oracle ASM disk group and the disk group is not available, then you must recover and mount the Oracle ASM disk group.

Restoring the Oracle Cluster Registry on Linux or UNIX Systems:

If you are storing OCR on an Oracle ASM disk group, and that disk group is corrupt, then you must restore the Oracle ASM disk group using Oracle ASM utilities, and then mount the disk group again before recovering OCR. Recover OCR by running the ocrconfig -restore command, as instructed in the following procedure.

Note:
If the original OCR location does not exist, then you must create an empty (0 byte) OCR location with the same name as the original OCR location before you run the ocrconfig -restore command.
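For example, if the original OCR location was a file on a cluster file system, an empty placeholder can be created with touch before the restore (the path below is just an example):

# touch /cfs/oracle/ocr/ocr.dbf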

Use the following procedure to restore OCR on Linux or UNIX systems:
1. List the nodes in your cluster by running the following command on one node:
$ olsnodes
Example :( from grid home)
$ olsnodes
racdb01
racdb02
$

2. Stop Oracle Clusterware by running the following command as root on all of the nodes:
# crsctl stop crs
If the preceding command returns any error due to OCR corruption, stop Oracle Clusterware by running the following command as root on all of the nodes:
# crsctl stop crs -f
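You can confirm on each node that the stack is really down before continuing, for example with:

# crsctl check crs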

3. If you are restoring OCR to a cluster file system or network file system, then run the following command as root to restore OCR with an OCR backup that you can identify in "Listing Backup Files":
# ocrconfig -restore file_name
After you complete this step, proceed to step 10.

4. Start the Oracle Clusterware stack on one node in exclusive mode by running the following command as root:
# crsctl start crs -excl -nocrs
The -nocrs option ensures that the crsd process and OCR do not start with the rest of the Oracle Clusterware stack.
Ignore any errors that display.
Check whether crsd is running. If it is, then stop it by running the following command as root:
# crsctl stop resource ora.crsd -init

Caution:
Do not use the -init flag with any other command.

5. If you want to restore OCR to an Oracle ASM disk group, then you must first create a disk group using SQL*Plus that has the same name as the disk group you want to restore and mount it on the local node.

If you cannot mount the disk group locally, then run the following SQL*Plus command:

SQL> drop diskgroup disk_group_name force including contents;
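After dropping the corrupt disk group, re-create it with the same name before the restore. A minimal sketch, assuming the OCR disk group was called +OCRVD and using placeholder disk names (adapt them to your own ASM disks):

SQL> create diskgroup OCRVD normal redundancy
      failgroup fg1 disk '/dev/rhdisk5'
      failgroup fg2 disk '/dev/rhdisk6'
      failgroup fg3 disk '/dev/rhdisk7'
      attribute 'compatible.asm' = '11.2.0.0.0';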

Optionally, if you want to restore OCR to a raw device, then you must run the ocrconfig -repair -replace command as root, assuming that you have all the necessary permissions on all nodes to do so and that OCR was not previously on Oracle ASM.

6. Restore OCR with an OCR backup that you can identify in "Listing Backup Files" by running the following command as root:

# ocrconfig -restore file_name

Notes:
Ensure that the OCR devices that you specify in the OCR configuration exist and that these OCR devices are valid.
If you configured OCR in an Oracle ASM disk group, then ensure that the Oracle ASM disk group exists and is mounted.
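To verify that the disk group is mounted on the local ASM instance (and to mount it if it is not), you can connect to the ASM instance with SQL*Plus; a sketch, assuming the disk group is +OCRVD:

SQL> select name, state from v$asm_diskgroup;
SQL> alter diskgroup OCRVD mount;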

7. Verify the integrity of OCR:
# ocrcheck
Example:
$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3484
         Available space (kbytes) :     258636
         ID                       :  247691592
         Device/File Name         :     +OCRVD
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check bypassed due to non-privileged user



8. Stop Oracle Clusterware on the node where it is running in exclusive mode:

# crsctl stop crs -f

9. Run the ocrconfig -repair -replace command as root on all the nodes in the cluster where you did not run the ocrconfig -restore command. For example, if you ran the ocrconfig -restore command on node 1 of a four-node cluster, then you must run the ocrconfig -repair -replace command on nodes 2, 3, and 4.
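A sketch of that repair command, assuming OCR has been restored into the +OCRVD disk group and /dev/raw/raw1 is just a placeholder for the old, no longer valid OCR location recorded on those nodes:

# ocrconfig -repair -replace /dev/raw/raw1 -replacement +OCRVD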

10. Begin to start Oracle Clusterware by running the following command as root on all of the nodes:
# crsctl start crs

11. Verify OCR integrity of all of the cluster nodes that are configured as part of your cluster by running the following CVU command:
$ cluvfy comp ocr -n all -verbose

Example:
$ cluvfy comp ocr -n all -verbose

Verifying OCR integrity 
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations


ASM Running check passed. ASM is running on all specified nodes
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
Disk group for ocr location "+OCRVD" available on all the nodes

NOTE: 
This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.

OCR integrity check passed

Verification of OCR integrity was successful. 



Some Important Outputs during setup activity:

This is a concern if our ASM Diskgroups consist of only 2 ASM Disks, and therefore only 2 failgroups, as with Extended RAC. For that reason, the new quorum failgroup clause was introduced:
create diskgroup data normal redundancy
 failgroup fg1 disk 'PROD:ASMDISK1'
 failgroup fg2 disk 'PROD:ASMDISK2'
 quorum failgroup fg3 disk 'PROD:ASMDISK3'
 attribute 'compatible.asm' = '11.2.0.0.0';
The failgroup fg3 above needs only one small Disk (300 MB should be on the safe side here, since the Voting File is only about 280 MB in size) to keep one mirror of the Voting File. fg1 and fg2 will each contain one Voting File as well as all the other stripes of the Database Area, but fg3 will only get that one Voting File.
# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   45fb190e0cc84f76bf8ca7b7b382e952 (/dev/rhdisk5) [OCRVD]
 2. ONLINE   0d2afd3bd77d4f05bff3f914d88e80dd (/dev/rhdisk6) [OCRVD]
 3. ONLINE   4dad87cb97334f8cbf5dca33ccf11f20 (/dev/rhdisk7) [OCRVD]
Located 3 voting disk(s).

Another important change regarding the Voting File is that it is no longer supported to back it up manually with dd. Instead, the Voting File gets backed up automatically into the OCR. As a new feature, you can now take a manual backup of the OCR at any time, without having to wait for the automatic backup – which still happens as well:
# ocrconfig -showbackup

racdb01     2016/01/07 08:42:51     /u01/app/11.2.0/grid/cdata/dbprdscan/backup00.ocr
racdb01     2016/01/07 04:42:50     /u01/app/11.2.0/grid/cdata/dbprdscan/backup01.ocr
racdb01     2016/01/07 00:42:50     /u01/app/11.2.0/grid/cdata/dbprdscan/backup02.ocr
racdb01     2016/01/06 08:42:46     /u01/app/11.2.0/grid/cdata/dbprdscan/day.ocr
racdb01     2015/12/26 04:33:23     /u01/app/11.2.0/grid/cdata/dbprdscan/week.ocr

Above are the automatic backups of the OCR as in earlier versions. Now the manual backup:

# cd  /u01/app/11.2.0/grid/bin
# ocrconfig -manualbackup
racdb01     2016/01/06 13:07:03     /u01/app/11.2.0/grid/cdata/cluhesse/backup_20101006_130703.ocr
I got a manual backup in the default location on my master node. We can define another backup location for the automatic as well as the manual backups – preferably on a shared device that is accessible by all the nodes (which is not the case with /home/oracle, unfortunately :-) ):
# /u01/app/11.2.0/grid/bin/ocrconfig -backuploc /home/oracle
# /u01/app/11.2.0/grid/bin/ocrconfig -manualbackup
racdb01     2016/01/06 13:10:50     /home/oracle/backup_20101006_131050.ocr
racdb01     2016/01/06 13:07:03     /u01/app/11.2.0/grid/cdata/cluhesse/backup_20101006_130703.ocr

# /u01/app/11.2.0/grid/bin/ocrconfig -showbackup
racdb01     2016/01/06 09:37:30     /u01/app/11.2.0/grid/cdata/cluhesse/backup00.ocr
racdb01     2016/01/06 05:37:29     /u01/app/11.2.0/grid/cdata/cluhesse/backup01.ocr
racdb01     2016/01/06 01:37:27     /u01/app/11.2.0/grid/cdata/cluhesse/backup02.ocr
racdb01     2010/10/05 01:37:21     /u01/app/11.2.0/grid/cdata/cluhesse/day.ocr
racdb01     2010/10/04 13:37:19     /u01/app/11.2.0/grid/cdata/cluhesse/week.ocr
racdb01     2016/01/06 13:10:50     /home/oracle/backup_20101006_131050.ocr
racdb01     2016/01/06 13:07:03     /u01/app/11.2.0/grid/cdata/cluhesse/backup_20101006_130703.ocr

Conclusion: The way the Voting Disk and OCR are handled has changed significantly – most notably, they can now be kept inside an ASM Diskgroup.


