Create and Manage custom Service in Oracle
11gR2 RAC database database:
We will discuss below points how we will configure and manage custom RAC services to connect database for various purposes:
- create service
- start service
- stop service
- config service
- modify service
Note: I assume that you have all basic ideas about RAC services. Let us see what is the output when we have a default database service. Assume I have a database with name PROD. i.e., It has two RAC instances ( for two node RAC)
Along with we will discuss a workaround on below issues at last:
Fix : PRCR-1013 : Failed to start resource - while start service in RAC database
Along with we will discuss a workaround on below issues at last:
Fix : PRCR-1013 : Failed to start resource - while start service in RAC database
Out put for configuration for default service for my SID :
$ srvctl config database -d PROD
Database unique name: PROD
Database name: PROD
Oracle home:
/oracle/oracle_base/product/dbhome_1
Oracle user: oracle
Spfile: +DATA/PROD/spfilePROD.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: PROD
Database instances: PROD1,PROD2
Disk Groups: DATA,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
Deatils of rac setup:
$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME
TARGET STATE
SERVER
STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.FRA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER_PROD.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.QUORUM.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.registry.acfs
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE
ONLINE rac2
ora.PROD.db
1 ONLINE
ONLINE rac1
Open
2 ONLINE
ONLINE rac2
Open
ora.cvu
1 ONLINE
ONLINE rac1
ora.rac1.vip
1 ONLINE
ONLINE rac1
ora.rac2.vip
1 ONLINE
ONLINE rac2
ora.oc4j
1 ONLINE
ONLINE rac1
ora.scan1.vip
1 ONLINE
ONLINE rac2
$
1) Create policy managed services:
syntax:
Adds services to a database and
assigns them to instances. If you have multiple instances of a cluster database
on the same node, then always use only one instance on that node for all of the
services that node manages.
Note:
The srvctl add service command does
not accept placement options for Oracle RAC One Node databases.
Syntax and Options
Use the srvctl add service command
to create a service, using the first of the following syntax models, or to
update an existing service, using the second syntax model:
srvctl add service -d db_unique_name -s service_name {-r "preferred_list"
[-a
"available_list"] [-P {BASIC | NONE | PRECONNECT}] | -g server_pool
[-c {UNIFORM |
SINGLETON]} [-k network_number]
[-l [PRIMARY |
PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY]
[-y {AUTOMATIC | MANUAL}]
[-q {TRUE | FALSE}] [-x {TRUE | FALSE}]
[-j {SHORT | LONG}][-B
{NONE | SERVICE_TIME | THROUGHPUT}]
[-e {NONE | SESSION |
SELECT}] [-m {NONE | BASIC}] [-z failover_retries]
[-w failover_delay]
srvctl add service -d
db_unique_name -s service_name
-u {-r preferred_list
| -a available_list} [-f]
Service Management Policy:
When you use Oracle Clusterware to
manage your database, you can configure startup options for each individual
database service when you add the service using the srvctl add service command
with the -y option. If you set the management policy for a service to AUTOMATIC
(the default), then the service starts automatically when you start the
database with SRVCTL. If you set the management policy to MANUAL, then the
service does not automatically start, and you must manually start it with
SRVCTL. A MANUAL setting does not prevent Oracle Clusterware from monitoring
the service when it is running and restarting it if a failure occurs. Prior to
Oracle RAC 11g release 2 (11.2), all services worked as though they were
defined with a MANUAL management policy.
Note:
When you use automatic services in
an administrator-managed database, during planned database startup, services
may start on the first instances to start rather than their preferred
instances.
Example from Oracle Doc:
Use this example syntax to add the
gl.example.com service to the my_rac database with AQ HA notifications enabled,
a failover method of BASIC, a Connection Load Balancing Goal of LONG, a
failover type of SELECT, and 180 failover retries with a delay of 5:
$ srvctl add service -d my_rac -s gl.example.com -m BASIC -e SELECT -z 180 -w 5 -j LONG
Use this example syntax to add a named service to a database with preferred instances in list one and available instances in list two, using preconnect failover for the available instances:
srvctl add service -d crm -s sales
-r crm01,crm02 -a crm03 -P PRECONNECT
Example:
$ srvctl add service -d
db_unique_name -s service_name -r preferred_list [-a available_list] [-P
TAF_policy]
$ srvctl add service -d PROD -s hrprod
-r PROD1,PROD2 -a PROD2 -m BASIC -e SELECT -z 180 -w 5 -j LONG
Some issues I faced during service creation. See my errors also:
TRY-1:
$ srvctl add service -d PROD -s hrprod
-r PROD1,PROD2 -a PROD2 -m BASIC -e SELECT -z 180 -w 5 -j LONG
PRKO-2102 : Failed to create server
pool PROD_hrprod: PRCS-1009 : Failed to create server pool ora.PROD_hrprod
PRCR-1071 : Failed to register or
update server pool ora.PROD_hrprod
CRS-2714: 'rac2' is specified
multiple times in 'SERVER_NAMES' of 'ora.PROD_hrprod'
Try-2:
srvctl add service -d PROD -s hrprod
-m BASIC -e SELECT -z 180 -w 5 -j LONG
$ srvctl add service -d PROD -s hrprod -m BASIC -e SELECT -z 180 -w 5 -j LONG
PRKO-3116 : '-g' or '-r' option
should be provided
Correct One:
$ srvctl add service -d PROD -s hrprod
-r PROD1,PROD2 -m BASIC -e SELECT -z 180 -w 5 -j LONG
outpu:
$ crsctl stat res -t
$ crsctl stat res -t
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE
ONLINE rac2
ora.PROD.hrprod.svc
1 OFFLINE
OFFLINE
2 OFFLINE
OFFLINE
ora.PROD.db
1 ONLINE
ONLINE rac1
Open
2 ONLINE
ONLINE rac2
Open
2) Start the services:
$ srvctl start service
Starts a service or multiple services on the specified instance. The srvctl start service command will fail if you attempt to start a service on an instance if that service is already running on its maximum number of instances, that is, its number of preferred instances. You may move a service or change the status of a service on an instance with the srvctl modify service and srvctl relocate service commands described later in this appendix.
Note:All manual service startup must
specify the name of the service to be started by the user.
Syntax and Options:
Use the srvctl start ser
vice command with the following syntax:
vice command with the following syntax:
srvctl start service -d
db_unique_name
[-s "service_name_list" [-n node_name | -i instance_name]]
[-o start_options]
Example:
a) Start service in one node:
$ srvctl start service -d PROD -s hrprod -n rac1 -o open
when service started in one node,
then see the output of 'crsctl stat res -t':
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE
ONLINE rac2
ora.PROD.hrprod.svc
1 ONLINE ONLINE
rac1
2 OFFLINE
OFFLINE
ora.PROD.db
1 ONLINE
ONLINE rac1
Open
2 ONLINE ONLINE
rac2
Open
Now try to start service in other
node:
$ srvctl start service -d PROD -s hrprod
-n rac2 -o open
b) Start service in all nodes at a
time:
$ srvctl start service -d PROD -s hrprod
-o open
see the output of 'crsctl stat res
-t':
--------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE
rac2
ora.PROD.hrprod.svc
1 ONLINE
ONLINE rac1
2 ONLINE
ONLINE rac2
ora.PROD.db
1
ONLINE ONLINE rac1
Open
2 ONLINE
ONLINE rac2
Open
3) Stop service:
srvctl stop service
Stops one or more services globally
across the cluster database, or on the specified instance.
Syntax and Options
Use the srvctl stop service command
with the following syntax:
srvctl stop service -d
db_unique_name [-s "service_name_list"
[-n
node_name | -i instance_name] [-f]
a) Stop service in one node/
instance:
$ srvctl stop service -d PROD -s hrprod -n rac1
$ srvctl stop service -d PROD -s hrprod
-n rac2
b) Stop service in all nodes/
instances:
$ srvctl stop service -d PROD -s hrprod
4) Config service:
$ srvctl config service -d PROD -s hrprod
$ srvctl config service -d PROD -s hrprod
Service name: hrprod
Service is enabled
Server pool: PROD_hrprod
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 180
TAF failover delay: 5
Connection Load Balancing Goal:
LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Preferred instances: PROD1,PROD2
Available instances:
$
When I created in one my other test database:
$ srvctl config service -d proddb
-s finprod
Service name: finprod
Service is enabled
Server pool: proddb_finprod
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal:
SHORT
Runtime Load Balancing Goal:
SERVICE_TIME
TAF policy specification: NONE
Edition:
Preferred instances: proddb1,proddb2
Available instances:
Best
practice: As per oracle suggestion, if
no standby, then below configuration is good. As per one of production setting, I used below
commands to create service:
$
srvctl add service -d PROD -s hrprod -r PROD1,PROD2 -B SERVICE_TIME -j SHORT
$ srvctl start service -d PROD -s hrprod
$ srvctl config service -d PROD -s hrprod
Service name: hrprod
Service is enabled
Server pool: PROD_hrprod
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: SHORT
Runtime Load Balancing Goal: SERVICE_TIME
TAF policy specification: NONE
Edition:
Preferred instances: PROD1,PROD2
Available instances:
$ srvctl start service -d PROD -s hrprod
$ srvctl config service -d PROD -s hrprod
Service name: hrprod
Service is enabled
Server pool: PROD_hrprod
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: SHORT
Runtime Load Balancing Goal: SERVICE_TIME
TAF policy specification: NONE
Edition:
Preferred instances: PROD1,PROD2
Available instances:
5) Remove service:
srvctl remove service --Removes the configuration for a
service.
Syntax and Options
Use the srvctl remove service
command as follows:
srvctl remove service -d
db_unique_name -s service_name [-i instance_name] [-f]
-f :
Removes the service resource even though the service is
running
Best practice: First stop the
service, then drop it.
$ srvctl remove service -d crm -s
sales
The following example removes the services from specific instances:
$ srvctl remove service -d crm -s
sales -i crm01,crm02
Example:
$ srvctl remove service -d PROD -s hrprod
-f
6) Re-create the services with
different TAF policy:
a) We have created earlier like
below:
$ srvctl add service -d PROD -s hrprod
-r PROD1,PROD2 -m BASIC -e SELECT -z 180 -w 5 -j LONG
b) In below example we are taking
care entire SESSION failover instead of SELECT sessions.
srvctl add service -d PROD -s hrprod
-r PROD1,PROD1 -e SESSION -m BASIC -w 10 -z 150
c) If culster is primary cluster
and you have standby cluster also, then follow below guide lines:
1. On the primary and standby hosts
create the service (oltpworkload) that the application will use to connect to
the database. The service should be created such that it is associated with and
runs on the database when it is in the ‘PRIMARY’ database role:
Primary cluster:
srvctl add service -d Austin -s
oltpworkload -r ssa1,ssa2,ssa3,ssa4 -l PRIMARY -q TRUE -e SESSION -m BASIC -w
10 -z 150
Standby cluster:
srvctl add service -d Houston -s
oltpworkload -r ssb1,ssb2,ssb3,ssb4 -l PRIMARY -q TRUE -e SESSION -m BASIC -w
10 -z 150
2. If the standby is also going to
support read-only reporting applications, then create a service specific for
this workload (reports) that will start when the database is in
PHYSICAL_STANDBY role.
Primary cluster:
srvctl add service -d Austin -s
reports -r ssa1,ssa2,ssa3,ssa4 -l PHYSICAL_STANDBY -q TRUE -e SESSION -m BASIC
-w 10 -z 150
Standby cluster:
srvctl add service -d Houston -s
reports -r ssb1,ssb2,ssb3,ssb4 -l PHYSICAL_STANDBY -q TRUE -e SESSION -m BASIC
-w 10 -z 150
In addition to creating the
database service “reports” on both clusters, the following SQL statement must
also be run on the primary database so that the service definition is
transmitted via the redo stream and applied to the physical standby database:
SQL run at the Primary database:
EXECUTE
DBMS_SERVICE.CREATE_SERVICE('reports', 'reports', NULL, NULL,TRUE, 'BASIC',
'SESSION', 150, 10, NULL);
7) Configure an Oracle Net :
primary & standby cluster
Configure an Oracle Net alias that
the OCI application will use to connect to the database. The Oracle Net alias
should specify both the primary and standby SCAN hostnames. For best
performance while creating new connections the Oracle Net alias should have
LOAD_BALANCE=OFF for the DESCRIPTION_LIST so that DESCRIPTIONs are tried in an
ordered list, top to bottom. With this configuration the second DESCRIPTION is
only attempted if all connection attempts to the first DESCRIPTION have failed.
SALES=
(DESCRIPTION_LIST=
(LOAD_BALANCE=off)
(FAILOVER=on)
(DESCRIPTION=
(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
(ADDRESS_LIST=
(LOAD_BALANCE=on)
(ADDRESS=(PROTOCOL=TCP)(HOST=Austin-scan)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME=oltpworkload)))
(DESCRIPTION=
(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
(ADDRESS_LIST=
(LOAD_BALANCE=on)
(ADDRESS=(PROTOCOL=TCP)(HOST= Houston-scan)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME=oltpworkload))))
When a new connection is made using
the above Oracle Net alias the following logic is used:
a) Oracle Net contacts DNS and
resolves Austin-scan to a total of 3 IP addresses.
b) Oracle Net randomly picks one of
the 3 IP address and attempts to make a connection. If the connection attempt
to the IP address does not respond in 3 seconds (TRANSPORT_CONNECT_TIMEOUT) the
next IP address is attempted. All 3 IP addresses will be tried a total of four
times (initial attempt plus RETRY_COUNT in the above example).
c) If the connection to primary
site is unsuccessful, it then contacts DNS and resolves Houston-scan to 3
addresses.
d) The same sequence is performed
for the standby Houston-scan as it was for the Austin-scan.
When a new connection is made using
the above Oracle Net alias the following logic is used:
a) Oracle Net contacts DNS and
resolves Austin-scan to a total of 3 IP addresses.
b) Oracle Net randomly picks one of
the 3 IP address and attempts to make a connection. If the connection attempt
to the IP address does not respond in 3 seconds (TRANSPORT_CONNECT_TIMEOUT) the
next IP address is attempted. All 3 IP addresses will be tried a total of four
times (initial attempt plus RETRY_COUNT in the above example).
c) If the connection to primary
site is unsuccessful, it then contacts DNS and resolves Houston-scan to 3
addresses.
d) The same sequence is performed
for the standby Houston-scan as it was for the Austin-scan.
Additional information on the
Oracle Net parameters used in the above alias:
LOAD_BALANCE is ON by default for
DESCRIPTION_LIST only. This parameter by default is OFF for an address list
within a DESCRIPTION. Setting this ON for a SCAN-based address implies that new
connections will be randomly assigned to one of the 3 SCAN-based IP addresses
resolved by DNS.
In certain situations, round-robin
address assignment by DNS may not be possible - see the Oracle Database
11.2.0.2 Readme. The best practice to ensure connect-time client load balancing
across the 3 SCAN IP addresses is to explicitly specify LOAD_BALANCE=on. Note
that this behavior is independent of server-side load balancing which will
occur subsequently, after the initial SCAN listener receives the connection
request.
The default value for the FAILOVER
parameter is ON for an address list within a DESCRIPTION. This impacts the 3
SCAN IP addresses the same way as if those 3 IP addresses were listed
explicitly in the connect descriptor. This means that if the initial connection
requests to the first randomly-assigned SCAN IP address fails, the connection
will failover to another SCAN IP address, and will continue to do so, till it
iterates the complete address list. Note that this parameter is relevant only
to new connections. Failover of existing connections is handled by TAF, which
is controlled by the separate FAILOVER_MODE parameter.
The CONNECT_TIMEOUT parameter is
the time to connect to the database instance providing the requested service,
and includes the time to establish a TCP connection to the listener. The TCP
duration is controlled by TRANSPORT_CONNECT_TIMEOUT, which has a default value
of 60 seconds. If both timeouts are specified, it is recommended that
CONNECT_TIMEOUT be set to a value slightly greater than
TRANSPORT_CONNECT_TIMEOUT. The timeout interval is applicable for each ADDRESS
in an ADDRESS_LIST, and each IP address to which a host name is mapped. Set the
CONNECT_TIMEOUT parameter to the maximum amount of time (in seconds) to wait
for a response from an address before skipping to the next address. A setting
of three seconds is recommended and is acceptable in most cases. Do not set
this parameter to fewer than three seconds.
The equivalent global parameter in
sqlnet.ora is SQLNET.OUTBOUND_CONNECT_TIMEOUT. If the same timeout value is
sufficient for all connect strings, it would be simpler to set the global
parameter. Otherwise, a separate setting can be done for each connect string.
The equivalent global parameter for
TRANSPORT_CONNECT_TIMEOUT is TCP.CONNECT_TIMEOUT. Both these parameters are
applicable only when the protocol is TCP.
The RETRY_COUNT parameter specifies
the number of times an address list is traversed before the new connection
attempt is terminated. The default value is 0. With respect to SCAN, with
FAILOVER = on, setting this RETRY_COUNT parameter to a value of 2 (for
example), means the 3 SCAN IP addresses are traversed thrice (i.e. 3*3=9
connect attempts), before the connection is terminated:
> When the
connection request initially comes in, the first randomly assigned IP address
tries to service that request, followed by the two remaining IP addresses (this
behavior is controlled by the FAILOVER parameter);
> The retries
then kick in and the list of 3 IP addresses is tried two more times.
RETRY_COUNT is only supported at DESCRIPTION level in connect string, but not
at global (i.e. sqlnet.ora) level.
Incident/ Issues to start a service:- Test case scenario
When unable to start a service:
Fix : PRCR-1013 : Failed to start
resource - while start service in RAC database
Suddenly I found one of the service
which is created for load-balancing TAF purpose, is down. When I tried to start
it, then below error came.
$ srvctl start service -d proddb -s
finprod -n racdb2 -o open
PRCR-1013 : Failed to start
resource ora.proddb.finprod.svc
PRCR-1064 : Failed to start
resource ora.proddb.finprod.svc on node racdb2
CRS-5017: The resource action
"ora.proddb.finprod.svc start" encountered the following error:
ORA-06550: line 1, column 34:
PLS-00553: character set name is
not recognized
ORA-06550: line 0, column 0:
PL/SQL: Compilation unit analysis
terminated
. For details refer to
"(:CLSN00107:)" in "/u01/app/grid4/log/racdb2/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.proddb.finprod.svc'
on 'racdb2' failed
CRS-5017: The resource action
"ora.proddb.finprod.svc clean" encountered the following error:
ORA-06550: line 1, column 33:
PLS-00553: character set name is
not recognized
ORA-06550: line 0, column 0:
PL/SQL: Compilation unit analysis
terminated
. For details refer to
"(:CLSN00106:)" in "/u01/app/grid4/log/racdb2/agent/crsd/oraagent_oracle/oraagent_oracle.log".
Now read the error written in
oraagent_oracle.log file. Best way is raise SR and follow it immediately.
Collect details:
script /tmp/out.1
echo "Took on server
`hostname` at `date` as `id`"
crsctl stat res -t
srvctl stop service -d proddb -s finprod
crsctl stat res -t
### make sure the service is
stopped at this point
export SRVM_TRACE=true
srvctl start service -d proddb -s finprod
date
crsctl stat res -t
date
#### service should be still down
since its not working this way.
srvctl start service -d proddb -s finprod
-i proddb2
crsctl stat res -t
date
exit
Workaround:
I did below activities and resolved
the issues:
Before any activity, output of
"crsctl stat res -t":
--------------------------------------------------------------------------------
NAME
TARGET STATE
SERVER
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.FRA.dg
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.LISTENER.lsnr
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.asm
ONLINE ONLINE racdb1
Started
ONLINE ONLINE racdb2
Started
ora.gsd
OFFLINE OFFLINE racdb1
OFFLINE OFFLINE racdb2
ora.net1.network
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.ons
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.registry.acfs
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE
ONLINE racdb2
ora.LISTENER_SCAN2.lsnr
1 ONLINE
ONLINE racdb1
ora.LISTENER_SCAN3.lsnr
1 ONLINE
ONLINE racdb1
ora.cvu
1 ONLINE
ONLINE racdb1
ora.racdb1.vip
1 ONLINE
ONLINE racdb1
ora.racdb2.vip
1 ONLINE
ONLINE racdb2
ora.proddb.db
1 ONLINE
ONLINE racdb1
Open
2 ONLINE
ONLINE racdb2
Open
ora.proddb.finprod.svc
1 ONLINE
ONLINE racdb1
2 ONLINE
OFFLINE
ora.oc4j
1 ONLINE
ONLINE racdb1
ora.scan1.vip
1 ONLINE
ONLINE racdb2
ora.scan2.vip
1 ONLINE
ONLINE racdb1
ora.scan3.vip
1 ONLINE
ONLINE racdb1
Here service is not running in
node2. In very first step stop entire service in all nodes and start it again.
If same issue is comming you can follow below practice.
1. Ensure that the database is
properly started up, that is, mounted and opened.
2. Find PID and kill the process:
cd $GRID_HOME/log/nodename/agent/crsd/oraagent_oracle
cd $GRID_HOME/log/nodename/agent/crsd/oraagent_oracle
cat oraagent_oracle.pid
You should see a pid number in the
file.
ps -ef |grep <pid> ----->
verify if this pid is the oraagent.bin process.
if it is the oraagent.bin process
kill it.
kill -9 <pid>
3. If not ps -ef|grep d.bin. You
should find two oraagent.bin processes kill these.
Killing the process should not have
any effect and they should respawn again automatically.
Confirm that the processes have
started.
4. Then restart the service using
srvctl.
srvctl start service -d
<database_name> -s <service_name> -i <instance_name>
Activity done in my test env.:
cd /u01/app/grid4/log/racdb2/agent/crsd/oraagent_oracle
$ cat oraagent_oracle.pid
7798818
$ ps -ef|grep 7798818
$ ps -ef|grep 7798818
oracle
7798818 1 2
Jul 22 - 43:22 /u01/app/grid4/bin/oraagent.bin
oracle 16974300
36307328 0 17:35:08 pts/4 0:00 grep 7798818
kill -9 7798818
ps -ef|grep d.bin
oracle
5767238 1 0
Jul 22 - 7:16 /u01/app/grid4/bin/evmd.bin
oracle
5832844 1 0
Jul 22 - 1:45 /u01/app/grid4/bin/gpnpd.bin
root
5963902 1 0
Jul 22 - 15:07 /u01/app/grid4/bin/ohasd.bin
reboot
oracle
8192028 1 0
Jul 22 - 0:05 /u01/app/grid4/bin/mdnsd.bin
oracle 6291842
8585614 3 Jul 22 - 70:27
/u01/app/grid4/bin/ocssd.bin
root
7668170 1 0
Jul 22 - 7:40 /u01/app/grid4/bin/octssd.bin
reboot
root
26411340 1 3
Jul 23 - 41:36 /u01/app/grid4/bin/osysmond.bin
oracle 28049762
36307328 0 17:36:52 pts/4 0:00 grep d.bin
root
45875706 1 0
Jul 22 - 18:22 /u01/app/grid4/bin/crsd.bin reboot
oracle
51052946 1 0
Jul 22 - 10:53 /u01/app/grid4/bin/gipcd.bin
Note : There is no pid value as we
killed.
Now start the service again:
$ srvctl start service -d proddb -s finprod
-n racdb2 -o open
Yah. It is started. Now see the
"crsctl stat res -t" output:
$ crsctl stat res -t
--------------------------------------------------------------------
NAME
TARGET STATE
SERVER
STATE_DETAILS
--------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.FRA.dg
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.LISTENER.lsnr
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.asm
ONLINE ONLINE racdb1
Started
ONLINE ONLINE racdb2
Started
ora.gsd
OFFLINE OFFLINE racdb1
OFFLINE OFFLINE racdb2
ora.net1.network
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.ons
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
ora.registry.acfs
ONLINE ONLINE racdb1
ONLINE ONLINE racdb2
--------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE
ONLINE racdb2
ora.LISTENER_SCAN2.lsnr
1 ONLINE
ONLINE racdb1
ora.LISTENER_SCAN3.lsnr
1 ONLINE
ONLINE racdb1
ora.cvu
1 ONLINE
ONLINE racdb1
ora.racdb1.vip
1 ONLINE
ONLINE racdb1
ora.racdb2.vip
1 ONLINE
ONLINE racdb2
ora.proddb.db
1 ONLINE
ONLINE racdb1
Open
2 ONLINE
ONLINE racdb2
Open
ora.proddb.finprod.svc
1 ONLINE
ONLINE racdb1
2 ONLINE
ONLINE racdb2
ora.oc4j
1 ONLINE
ONLINE racdb1
ora.scan1.vip
1 ONLINE
ONLINE racdb2
ora.scan2.vip
1 ONLINE
ONLINE racdb1
ora.scan3.vip
1 ONLINE
ONLINE racdb1
Service Name Usage more points:
If you have a RAC database, you may have introduced a few features to support some of the advanced functionalities of RAC, e.g., load balancing across nodes and making sessions fail over to a surviving node when a node dies. These things will occur only if users are employing the proper connect string, utilizing SERVICE_NAME and not SID. Here is one way to use connect strings, utilizing service name "finprod" connecting to the RAC database running on three nodes – rac1, rac2 and rac3.
PROD =
(DESCRIPTION =
(LOAD_BALANCE = on)
(FAILOVER = on)
(enable = broken)
(ADDRESS = (PROTOCOL = TCP)(HOST = racdb1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = racdb2)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = racdb3)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = finprod)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 120)
(DELAY = 2)
)
)
)
Key Note on database service:
Service manage in Oracle RAC database- Command List
(Keys and description)
-d Database Name
-i Instance Name
-s Service Name
-n Node Name
-r Preferred list
-a Available list
-v Verbose
-p policy {AUTOMATIC | MANUAL}
-t TAF
-f Force stop or relocate service operations
-- Check status of database and list services:
$ srvctl status database -d db_unique_name -v
Or using View :
SQL> select * from gv$active_services
-- View configuration of services
$ srvctl config service -d DB2 –a
-- Add service
$ srvctl add service -d PROD -s OLTP_SVC -r PROD1,PROD2 -a PROD3
$ srvctl add service -d PROD -s BATCH_SVC -r PROD3 -a PROD1,PROD2
-- Modify service
$ srvctl modify service -s service_name -d db_unique_name -n -i pref_inst_list [ -a avail_inst_list] [-f]
e.g., $ srvctl modify service -s OLTP_SVC -d PROD -n -i PROD3 -a PROD1,PROD2
-- Moving a service member from one instance to another (move and change configuration):
$ srvctl modify service -s PROD_UK_SEARCH -d PROD -i PROD1 -t PROD2
-- Relocate service (not changing configuration):
$ srvctl relocate service -d PROD -s PROD_OPERATIONS_SERVICE -i PROD1 -t PROD3
-- Enable|disable|start|stop service
$ srvctl enable|disable|start|stop service -s service_name -d db_unique_name
-- Remove Service:
$ srvctl remove service -d PROD -s PROD_UK_SEARCH -i PROD5
-- Create Service and TAF:
$ srvctl add service –d PROD –s crmprd -r node1, node2 -B SERVICE TIME –j SHORT
$ srvctl start service –d PROD –s crmprd [- w – failover delay, -z -> failover retries
$ srvctl add service -d PROD -s hrprod -r PROD1,PROD2 -m BASIC -e SELECT -z 180 -w 5 -j LONG
Reference :
http://docs.oracle.com/cd/E11882_01/rac.112/e41960/srvctladmin.htm
excellent post on rac
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteNever seen any docs like this. Appreciate your efforts.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDelete