Monday, August 23, 2010

Restoring Vote disk due to ASM disk failures - 1

Restoring OCR after ASM disk failures was covered in a previous post. This post looks at restoring voting disks after ASM disk failures.

Some important pointers from the Clusterware Admin Guide: "OCR and voting disks can be stored in Oracle Automatic Storage Management (Oracle ASM). The Oracle ASM partnership and status table (PST) is replicated on multiple disks and is extended to store OCR. Consequently, OCR can tolerate the loss of the same number of disks as are in the underlying disk group and be relocated in response to disk failures.

Oracle ASM reserves several blocks at a fixed location on every Oracle ASM disk for storing the voting disk. Should the disk holding the voting disk fail, Oracle ASM selects another disk on which to store this data. Storing OCR and the voting disk on Oracle ASM eliminates the need for third-party cluster volume managers and eliminates the complexity of managing disk partitions for OCR and voting disks in Oracle Clusterware installations.

The dd commands used to back up and recover voting disks in previous versions of Oracle Clusterware are not supported in Oracle Clusterware 11g release 2 (11.2).

Voting disk management requires a valid and working OCR. Before you add, delete, replace, or restore voting disks, run the ocrcheck command as root. If OCR is not available or it is corrupt, then you must restore OCR.

The voting disk data is automatically backed up in OCR as part of any configuration change and is automatically restored to any voting disk added.

If all of the voting disks are corrupted, then Restore OCR. This step is necessary only if OCR is also corrupted or otherwise unavailable, such as if OCR is on Oracle ASM and the disk group is no longer available.

The number of voting files you can store in a particular Oracle ASM disk group depends upon the redundancy of the disk group.
External redundancy: A disk group with external redundancy can store only one voting disk
Normal redundancy: A disk group with normal redundancy stores three voting disks
High redundancy: A disk group with high redundancy stores five voting disks

By default, Oracle ASM puts each voting disk in its own failure group within the disk group. A normal redundancy disk group must contain at least two failure groups but if you are storing your voting disks on Oracle ASM, then a normal redundancy disk group must contain at least three failure groups.
" (Oracle Clusterware Admin Guide)

The following error is thrown if a disk group with only two failure groups is used to store the voting disks:
crsctl replace votedisk +CLUSTERDG
Failed to create voting files on disk group CLUSTERDG.
Change to configuration failed, but was successfully rolled back.
CRS-4000: Command Replace failed, or completed with errors.
In the ASM alert log:
Thu Aug 19 15:34:29 2010
NOTE: updated gpnp profile ASM diskstring: ORCL:*
Thu Aug 19 15:34:29 2010
NOTE: Creating voting files in diskgroup CLUSTERDG
Thu Aug 19 15:34:29 2010
NOTE: Voting File refresh pending for group 1/0x23d6dd4 (CLUSTERDG)
NOTE: Attempting voting file creation in diskgroup CLUSTERDG
NOTE: voting file allocation on grp 1 disk CLUS2
NOTE: voting file allocation on grp 1 disk CLUS3
ERROR: Voting file allocation failed for group CLUSTERDG
Errors in file /opt/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_20421.trc:
ORA-15273: Could not create the required number of voting files.
NOTE: Voting file relocation is required in diskgroup CLUSTERDG
NOTE: Attempting voting file relocation on diskgroup CLUSTERDG
NOTE: voting file deletion on grp 1 disk CLUS2
NOTE: voting file deletion on grp 1 disk CLUS3
"A high redundancy disk group must contain at least three failure groups. However, Oracle recommends using several failure groups. A small number of failure groups, or failure groups of uneven capacity, can create allocation problems that prevent full use of all of the available storage.
You must specify enough failure groups in each disk group to support the redundancy type for that disk group.

Neither should you add a voting disk to a cluster file system in addition to the voting disks stored in an Oracle ASM disk group. Oracle does not support having voting disks in Oracle ASM and directly on a cluster file system for the same cluster at the same time.
"(Oracle Clusterware Admin Guide)

Scenario 1.
1. Only Vote disks are in ASM diskgroup
2. ASM diskgroup has normal redundancy with only three failure groups
3. Only one failure group is affected
4. OCR is located in a separate location (another disk group, or a block device; storing OCR on a block device is not supported by Oracle and is only valid during migration, but it can be moved there after installation for testing purposes)

All that is required in this scenario is to drop the affected disk from the ASM diskgroup, repair it and add it back to the diskgroup.

1. It is assumed that the voting disks are already in an ASM disk group; if not, move them to one. The current voting disk configuration is as follows
crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 393aebdd039e4fdabf242bf461c45136 (ORCL:CLUS1) [CLUSTERDG]
2. ONLINE 04daf60741d34f9abf22e76fe22913b2 (ORCL:CLUS2) [CLUSTERDG]
3. ONLINE 2f38bb8318d34f1bbf49d28f6e400f60 (ORCL:CLUS3) [CLUSTERDG]
Located 3 voting disk(s).
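A voting disk going offline can be spotted mechanically by parsing this output. A minimal sketch follows; the sample text is hardcoded here in place of a live crsctl query css votedisk call, so the parsing can be tried without a cluster:

```shell
# Sample output standing in for: crsctl query css votedisk
votedisk_output='1.  ONLINE  393aebdd039e4fdabf242bf461c45136 (ORCL:CLUS1) [CLUSTERDG]
2.  ONLINE  04daf60741d34f9abf22e76fe22913b2 (ORCL:CLUS2) [CLUSTERDG]
3.  ONLINE  2f38bb8318d34f1bbf49d28f6e400f60 (ORCL:CLUS3) [CLUSTERDG]'

# Count voting disks per state; a healthy normal redundancy setup shows 3 ONLINE
online=$(printf '%s\n' "$votedisk_output" | awk '/ ONLINE /{n++} END{print n+0}')
offline=$(printf '%s\n' "$votedisk_output" | awk '/ OFFLINE /{n++} END{print n+0}')
echo "online=$online offline=$offline"
```

On a live cluster the hardcoded sample would simply be replaced by the command's real output.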
2. Corrupt one of the disks to simulate disk failure
/etc/init.d/oracleasm querydisk -p CLUS1
Disk "CLUS1" is a valid ASM disk
/dev/sdc2: LABEL="CLUS1" TYPE="oracleasm"

dd if=/dev/zero of=/dev/sdc2 count=20480 bs=8192
20480+0 records in
20480+0 records out
167772160 bytes (168 MB) copied, 0.162996 seconds, 1.0 GB/s
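As a sanity check on the dd output above, the byte count is just blocks times block size:

```shell
# 20480 blocks of 8192 bytes each, as passed to dd above
bytes=$((20480 * 8192))
echo "$bytes"   # 167772160 bytes, i.e. 160 MiB (reported by dd as 168 MB decimal)
```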
3. This has no effect on the OCR. In this case OCR is stored on a block device (a configuration not supported by Oracle, used here for testing)
ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 148348
Used space (kbytes) : 4580
Available space (kbytes) : 143768
ID : 552644455
Device/File Name : /dev/sdc5
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded
Logical corruption check succeeded
4. Grid Infrastructure alert log ($CRS_HOME/log/`hostname -s`/alert`hostname -s`.log) shows the following entry after some time
2010-08-19 15:51:56.537
[cssd(25119)]CRS-1605:CSSD voting file is online: ORCL:CLUS1; details in /opt/app/11.2.0/grid/log/hpc1/cssd/ocssd.log.
5. The ocssd.log lists the following
2010-08-19 15:51:44.790: [    CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
2010-08-19 15:51:44.790: [ CSSD][1230346560]clssnmSendingThread: sent 5 status msgs to all nodes
....
2010-08-19 15:51:49.790: [ CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
2010-08-19 15:51:49.790: [ CSSD][1230346560]clssnmSendingThread: sent 5 status msgs to all nodes
2010-08-19 15:51:53.513: [ CSSD][1303775552]clssnmvDiskKillCheck: voting disk corrupted (0x00000000,0x00000000) (ORCL:CLUS1)
2010-08-19 15:51:53.513: [ CSSD][1303775552]clssnmvDiskAvailabilityChange: voting file ORCL:CLUS1 now offline
2010-08-19 15:51:53.816: [ CLSF][1366714688]Closing handle:0x1caa6570
2010-08-19 15:51:53.816: [ SKGFD][1366714688]Lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: closing handle 0x1ccf45c0 for
disk :ORCL:CLUS1:
2010-08-19 15:51:54.515: [ CLSF][1303775552]Closing handle:0x1cd496d0
2010-08-19 15:51:54.515: [ SKGFD][1303775552]Lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: closing handle 0x1cd48e80 for
disk :ORCL:CLUS1:
2010-08-19 15:51:54.516: [ CLSF][1324755264]Closing handle:0x1c958930
2010-08-19 15:51:54.516: [ SKGFD][1324755264]Lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: closing handle 0x1cac17c0 for
disk :ORCL:CLUS1:
2010-08-19 15:51:54.790: [ CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
2010-08-19 15:51:54.790: [ CSSD][1230346560]clssnmSendingThread: sent 5 status msgs to all nodes
2010-08-19 15:51:54.799: [ CSSD][1146427712]clssnmvSchedDiskThreads: DiskPingThread for voting file ORCL:CLUS1 sched delay 2300 > margin 1500 cur_ms 527387044 lastalive 527384744
2010-08-19 15:51:56.523: [ CSSD][1324755264]clssnmvDiskOpen: Opening ORCL:CLUS1
2010-08-19 15:51:56.523: [ SKGFD][1324755264]Handle 0x1cac17c0 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:CLUS1:
2010-08-19 15:51:56.523: [ CLSF][1324755264]Opened hdl:0x1ca4d320 for dev:ORCL:CLUS1:
2010-08-19 15:51:56.533: [ CSSD][1324755264]clssnmvStatusBlkInit: myinfo nodename hpc1, uniqueness 1281715015
2010-08-19 15:51:56.533: [ CSSD][1324755264]clssnmvDiskAvailabilityChange: voting file ORCL:CLUS1 now online
2010-08-19 15:51:56.533: [ SKGFD][1366714688]Handle 0x1cd48e80 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:CLUS1:
2010-08-19 15:51:56.533: [ CLSF][1366714688]Opened hdl:0x1cba4ac0 for dev:ORCL:CLUS1:
2010-08-19 15:51:57.524: [ SKGFD][1303775552]Handle 0x1ccf45c0 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:CLUS1:
2010-08-19 15:51:57.524: [ CLSF][1303775552]Opened hdl:0x1cd496d0 for dev:ORCL:CLUS1:
2010-08-19 15:51:59.790: [ CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
2010-08-19 15:51:59.790: [ CSSD][1230346560]clssnmSendingThread: sent 5 status msgs to all nodes
2010-08-19 15:52:03.403: [ CSSD][1146427712]clssscSelect: cookie accept request 0x1c75c298
2010-08-19 15:52:03.403: [ CSSD][1146427712]clssgmAllocProc: (0x2aaab421d380) allocated
2010-08-19 15:52:03.403: [ CSSD][1146427712]clssgmClientConnectMsg: properties of cmProc 0x2aaab421d380 - 1,2,3,4
2010-08-19 15:52:03.403: [ CSSD][1146427712]clssgmClientConnectMsg: Connect from con(0x41ee9f) proc(0x2aaab421d380) pid(21407)
version 11:2:1:4, properties: 1,2,3,4
2010-08-19 15:52:03.403: [ CSSD][1146427712]clssgmClientConnectMsg: msg flags 0x0000
2010-08-19 15:52:03.404: [ CSSD][1146427712]clssgmExecuteClientRequest: Node name request from client ((nil))
2010-08-19 15:52:03.404: [ CSSD][1146427712]clssgmExecuteClientRequest: VOTEDISKQUERY recvd from proc 29 (0x2aaab421d380)
2010-08-19 15:52:03.405: [ CSSD][1146427712]clssgmDeadProc: proc 0x2aaab421d380
2010-08-19 15:52:03.405: [ CSSD][1146427712]clssgmDestroyProc: cleaning up proc(0x2aaab421d380) con(0x41ee9f) skgpid ospid 21407 with 0 clients, refcount 0
2010-08-19 15:52:03.405: [ CSSD][1146427712]clssgmDiscEndpcl: gipcDestroy 0x41ee9f
2010-08-19 15:52:04.790: [ CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
2010-08-19 15:52:04.790: [ CSSD][1230346560]clssnmSendingThread: sent 5 status msgs to all nodes
2010-08-19 15:52:09.790: [ CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
2010-08-19 15:52:09.790: [ CSSD][1230346560]clssnmSendingThread: sent 5 status msgs to all nodes
2010-08-19 15:52:14.790: [ CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
2010-08-19 15:52:14.790: [ CSSD][1230346560]clssnmSendingThread: sent 5 status msgs to all nodes
2010-08-19 15:52:18.260: [ CSSD][1146427712]clssscSelect: cookie accept request 0x1c75c298
2010-08-19 15:52:18.260: [ CSSD][1146427712]clssgmAllocProc: (0x2aaab4261540) allocated
2010-08-19 15:52:18.260: [ CSSD][1146427712]clssgmClientConnectMsg: properties of cmProc 0x2aaab4261540 - 1,2,3,4
2010-08-19 15:52:18.260: [ CSSD][1146427712]clssgmClientConnectMsg: Connect from con(0x41ef82) proc(0x2aaab4261540) pid(21428)
version 11:2:1:4, properties: 1,2,3,4
2010-08-19 15:52:18.260: [ CSSD][1146427712]clssgmClientConnectMsg: msg flags 0x0000
2010-08-19 15:52:18.261: [ CSSD][1146427712]clssgmExecuteClientRequest: Node name request from client ((nil))
2010-08-19 15:52:18.262: [ CSSD][1146427712]clssgmExecuteClientRequest: VOTEDISKQUERY recvd from proc 29 (0x2aaab4261540)
2010-08-19 15:52:18.263: [ CSSD][1146427712]clssgmDeadProc: proc 0x2aaab4261540
2010-08-19 15:52:18.263: [ CSSD][1146427712]clssgmDestroyProc: cleaning up proc(0x2aaab4261540) con(0x41ef82) skgpid ospid 21428 with 0 clients, refcount 0
2010-08-19 15:52:18.263: [ CSSD][1146427712]clssgmDiscEndpcl: gipcDestroy 0x41ef82
2010-08-19 15:52:18.788: [ CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
2010-08-19 15:52:18.788: [ CSSD][1230346560]clssnmSendingThread: sent 4 status msgs to all nodes
2010-08-19 15:52:23.788: [ CSSD][1230346560]clssnmSendingThread: sending status msg to all nodes
Querying the voting disks still shows all three as online
crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 393aebdd039e4fdabf242bf461c45136 (ORCL:CLUS1) [CLUSTERDG]
2. ONLINE 04daf60741d34f9abf22e76fe22913b2 (ORCL:CLUS2) [CLUSTERDG]
3. ONLINE 2f38bb8318d34f1bbf49d28f6e400f60 (ORCL:CLUS3) [CLUSTERDG]
6. Stop and start CRS to see the effect of a restart
crsctl stop crs
crsctl start crs
During the restart, ocssd.log shows the discovery of only two voting disks instead of three
2010-08-19 15:58:12.758: [    CSSD][1145928000]clssnmvDiskVerify: file is not a voting file, cannot recognize on-disk signature for a voting
2010-08-19 15:58:12.758: [ CSSD][1145928000]clssnmvDiskVerify: file is not a voting file, cannot recognize on-disk signature for a voting
2010-08-19 15:58:12.758: [ CSSD][1145928000]clssnmvDiskVerify: Successful discovery of 2 disks
2010-08-19 15:58:12.758: [ CSSD][1145928000]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2010-08-19 15:58:12.758: [ CSSD][1145928000]clssnmCompleteVFDiscovery: Completing voting file discovery
and so does the CRS alert log
2010-08-19 15:58:22.999
[cssd(6264)]CRS-1605:CSSD voting file is online: ORCL:CLUS3; details in /opt/app/11.2.0/grid/log/hpc1/cssd/ocssd.log.
2010-08-19 15:58:23.035
[cssd(6264)]CRS-1605:CSSD voting file is online: ORCL:CLUS2; details in /opt/app/11.2.0/grid/log/hpc1/cssd/ocssd.log.
In the end the cluster stack starts, but with only two voting disks
crsctl check css
CRS-4529: Cluster Synchronization Services is online
[root@hpc1 oracle +ASM1]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. OFFLINE 393aebdd039e4fdabf242bf461c45136 () []
2. ONLINE 04daf60741d34f9abf22e76fe22913b2 (ORCL:CLUS2) [CLUSTERDG]
3. ONLINE 2f38bb8318d34f1bbf49d28f6e400f60 (ORCL:CLUS3) [CLUSTERDG]
7. The disk group containing the voting disks will be in a dismounted state after the restart.
select name,state from v$asm_diskgroup;
STATE NAME
----------- ---------
DISMOUNTED CLUSTERDG
From crsctl stat
HA Resource                                   Target     State
----------- ------ -----
ora.CLUSTERDG.dg ONLINE OFFLINE
ora.DATA.dg ONLINE ONLINE on hpc1
ora.FLASH.dg ONLINE ONLINE on hpc1
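Resources whose target is ONLINE but whose state is OFFLINE (as with ora.CLUSTERDG.dg above) can be picked out mechanically. A minimal sketch follows; the sample text is hardcoded from the output above in place of a live crsctl status call:

```shell
# Sample resource status lines standing in for live crsctl output
# Columns: resource name, target, state
stat_output='ora.CLUSTERDG.dg ONLINE OFFLINE
ora.DATA.dg ONLINE ONLINE on hpc1
ora.FLASH.dg ONLINE ONLINE on hpc1'

# Print resources where the target (field 2) does not match the state (field 3)
mismatched=$(printf '%s\n' "$stat_output" | awk '$2 == "ONLINE" && $3 == "OFFLINE" {print $1}')
echo "needs attention: $mismatched"
```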
8. The reason the disk group is not mounted is the failed disk, as the following entries from the ASM alert log show
SQL> ALTER DISKGROUP ALL MOUNT /* asm agent */
NOTE: Diskgroup used for Voting files is:
CLUSTERDG
NOTE: cache registered group CLUSTERDG number=1 incarn=0x909ef75f
NOTE: cache began mount (first) of group CLUSTERDG number=1 incarn=0x909ef75f
NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
NOTE: Assigning number (1,1) to disk (ORCL:CLUS2)
NOTE: Assigning number (1,2) to disk (ORCL:CLUS3)
NOTE: group CLUSTERDG: updated PST location: disk 0001 (PST copy 0)
NOTE: group CLUSTERDG: updated PST location: disk 0002 (PST copy 1)
NOTE: group CLUSTERDG: updated PST location: disk 0001 (PST copy 0)
NOTE: group CLUSTERDG: updated PST location: disk 0002 (PST copy 1)
NOTE: start heartbeating (grp 1)
kfdp_query(CLUSTERDG): 3
kfdp_queryBg(): 3
NOTE: group CLUSTERDG: updated PST location: disk 0001 (PST copy 0)
NOTE: group CLUSTERDG: updated PST location: disk 0002 (PST copy 1)
NOTE: Assigning number (1,0) to disk ()
kfdp_query(CLUSTERDG): 4
kfdp_queryBg(): 4
NOTE: group CLUSTERDG: updated PST location: disk 0001 (PST copy 0)
NOTE: group CLUSTERDG: updated PST location: disk 0002 (PST copy 1)
NOTE: cache dismounting (clean) group 1/0x909EF75F (CLUSTERDG)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0x909EF75F (CLUSTERDG)
NOTE: cache ending mount (fail) of group CLUSTERDG number=1 incarn=0x909ef75f
kfdp_dismount(): 5
kfdp_dismountBg(): 5
NOTE: De-assigning number (1,0) from disk ()
NOTE: De-assigning number (1,1) from disk (ORCL:CLUS2)
NOTE: De-assigning number (1,2) from disk (ORCL:CLUS3)
ERROR: diskgroup CLUSTERDG was not mounted
NOTE: cache deleting context for group CLUSTERDG 1/-1868630177
WARNING: Disk Group CLUSTERDG containing voting files is not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "0" is missing from group number "1"
ERROR: ALTER DISKGROUP ALL MOUNT /* asm agent */
Since a normal redundancy disk group can tolerate the failure of one failure group, this disk group can be mounted with the force option.
SQL> alter diskgroup clusterdg mount force;

Diskgroup altered.
9. Once the disk group is mounted, drop the failed disk, repair it and add it back to the disk group
SQL> alter diskgroup clusterdg drop disk CLUS1;
alter diskgroup clusterdg drop disk CLUS1
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15084: ASM disk "CLUS1" is offline and cannot be dropped.


SQL> alter diskgroup clusterdg drop disk CLUS1 force;

Diskgroup altered.

# /etc/init.d/oracleasm deletedisk clus1
Removing ASM disk "clus1": [ OK ]
# /etc/init.d/oracleasm createdisk clus1 /dev/sdc2
Marking disk "clus1" as an ASM disk: [ OK ]

alter diskgroup clusterdg add failgroup fail1 disk 'ORCL:CLUS1';
10. Verify that the new disk is added as a voting disk. In the ASM alert log:
Thu Aug 19 16:08:41 2010
ARB0 started with pid=26, OS id=7771
NOTE: assigning ARB0 to group 3/0xa62ef779 (CLUSTERDG)
NOTE: F1X0 copy 3 relocating from 65534:4294967294 to 3:2 for diskgroup 3 (CLUSTERDG)
NOTE: Voting file relocation is required in diskgroup CLUSTERDG
NOTE: Attempting voting file relocation on diskgroup CLUSTERDG
NOTE: voting file allocation on grp 3 disk CLUS1
Querying the voting disks
crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. OFFLINE 393aebdd039e4fdabf242bf461c45136 () []
2. ONLINE 04daf60741d34f9abf22e76fe22913b2 (ORCL:CLUS2) [CLUSTERDG]
3. ONLINE 2f38bb8318d34f1bbf49d28f6e400f60 (ORCL:CLUS3) [CLUSTERDG]
4. ONLINE 8a222576839d4f74bfa6f5949788c521 (ORCL:CLUS1) [CLUSTERDG]
The disk sequence numbers have changed to reflect the new order.

11. To drop the offline voting disk, move the voting disks to another disk group or block device and back again (the add and delete options on crsctl are not valid when voting disks are in ASM)
crsctl replace votedisk +DATA
Or
crsctl replace votedisk /dev/sdc6

Now formatting voting disk: /dev/sdc6.
CRS-4256: Updating the profile
Successful addition of voting disk 13f010bf72264f46bfb5aa1a50ce05a3.
Successful deletion of voting disk 393aebdd039e4fdabf242bf461c45136.
Successful deletion of voting disk 04daf60741d34f9abf22e76fe22913b2.
Successful deletion of voting disk 2f38bb8318d34f1bbf49d28f6e400f60.
Successful deletion of voting disk 8a222576839d4f74bfa6f5949788c521.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
[oracle@hpc1 ~ clusdb1]$ crsctl replace votedisk +CLUSTERDG
CRS-4256: Updating the profile
Successful addition of voting disk 1b6460e7b9394f96bf7cb3d04b9db909
Successful addition of voting disk 9bf04ee41bda4fd0bf90cee544a1c7f3
Successful addition of voting disk 3f77a85b0e274fb8bfb58f5190073867.
Successful deletion of voting disk 13f010bf72264f46bfb5aa1a50ce05a3.
Successfully replaced voting disk group with +CLUSTERDG.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced

crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 1b6460e7b9394f96bf7cb3d04b9db909 (ORCL:CLUS2) [CLUSTERDG]
2. ONLINE 9bf04ee41bda4fd0bf90cee544a1c7f3 (ORCL:CLUS3) [CLUSTERDG]
3. ONLINE 3f77a85b0e274fb8bfb58f5190073867 (ORCL:CLUS1) [CLUSTERDG]
Useful Metalink note
How to restore ASM based OCR after complete loss of the CRS diskgroup on Linux/Unix systems [ID 1062983.1]