Tuesday, December 7, 2010

Deleting an 11gR1 RAC Node

Removing a node that is part of a RAC involves several phases. The first phase is to remove the instances (both database and ASM) and stop the nodeapps. The second phase is to remove the Oracle Database software and remove the nodeapps; the final phase is to remove the clusterware. These must be done in this order to cleanly remove a node from the RAC.

An 11gR1 (11.1.0.7) RAC with two nodes (rac1, rac2) is used here, and rac2 will be deleted.

0. Back up the voting disk and OCR.
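As a minimal sketch (run as root; the raw device path and the /backup directory are assumptions for illustration, not from this setup):

crsctl query css votedisk                      # list the voting disk locations
dd if=/dev/raw/raw1 of=/backup/votedisk.bak    # block-copy a voting disk to a backup file
ocrconfig -export /backup/ocr.exp              # take a logical export of the OCR
ocrconfig -showbackup                          # confirm the automatic physical OCR backups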

1. Run dbca from a node that is not being deleted (in this case rac1). Select instance management from the options, then the delete instance option, and proceed to select the instance on the node that is to be deleted (the database instance on rac2).
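dbca can also run in silent mode if the GUI is not convenient. A sketch, assuming the database is racdb (as the ora.racdb.db resource below suggests) and its instance on rac2 is named racdb2 (a hypothetical instance name); verify the exact flags with dbca -help on your release:

dbca -silent -deleteInstance -nodeList rac2 -gdbName racdb -instanceName racdb2 -sysDBAUserName sys -sysDBAPassword password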

2. If ASM on that node is no longer needed (no other database instances are left on that node) then remove the ASM instance on that node by running
srvctl stop asm -n rac2
srvctl remove asm -n rac2
3. Stop the nodeapps on the node to be deleted and remove its listener using netca.
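Stopping the nodeapps is a single srvctl call (the listener removal itself is done interactively through netca):

srvctl stop nodeapps -n rac2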

This concludes the first phase.

4. Remove the Oracle Database software by running the following command on the node to be deleted
$ORACLE_HOME/oui/bin/runInstaller -deinstall -silent "REMOVE_HOMES={/opt/app/oracle/product/11.1.0/db_1}" -local
5. If database resources are running on the node to be deleted, then relocate them to a surviving node.
crs_stat

NAME=ora.racdb.db
TYPE=application
TARGET=ONLINE
STATE=ONLINE on rac2
It may be the case that no resources are running on the node to be deleted, in which case relocating is not necessary. If there are resources, relocate them with
crs_relocate ora.racdb.db
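Afterwards, querying the resource again should show it online on the surviving node:

crs_stat ora.racdb.db    # expect STATE=ONLINE on rac1 after the relocate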
6. Remove nodeapps by running the following as root
srvctl remove nodeapps -n rac2
Please confirm that you intend to remove the node-level applications on node rac2 (y/[n]) y
7. Update the node list in the Oracle inventory. Run this on any surviving node. (If more than one node remains, list them all, comma-separated, in CLUSTER_NODES.)
$ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME CLUSTER_NODES=rac1
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3906 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/app/oracle/oraInventory
'UpdateNodeList' was successful.
This concludes the second phase.

8. Remove the stored network interfaces by running the following on any one of the remaining nodes. This step is unnecessary if the interfaces were configured with the -global option of the Oracle Interface Configuration Tool (OIFCFG) during the installation, as is done by Oracle Universal Installer.
oifcfg delif -node rac2
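To verify, list the stored interface definitions; after the delif no rac2-specific entry should remain:

oifcfg getif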
9. Identify and remove the ONS remote port entries. Run the remove command on a node that is going to remain in the cluster.
$CRS_HOME/bin/onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
{node = rac1.domain.net, port = 6200}
Adding remote host rac1.domain.net:6200
onscfg[1]
{node = rac2.domain.net, port = 6200}
Setting remote port from OCR repository to 6200
Adding remote host rac2.domain.net:6200
ons is not running ...                       

racgons remove_config rac2.domain.net:6200
racgons: Existing key value on rac2.domain.net = 6200.
racgons: rac2.domain.net:6200 removed from OCR.
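Rerunning the ping at this point should retrieve only the surviving node's ONS configuration:

$CRS_HOME/bin/onsctl ping    # should now show numcfg = 1, listing only rac1.domain.net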
10. On the node to be deleted, run the rootdelete.sh script as the root user from the $CRS_HOME/install directory to disable the Oracle Clusterware applications and daemons running on the node.

Only run this command once. Use the nosharedhome argument if you are using a local file system; the default for this command is sharedhome, which prevents it from updating the permissions of local files so that they can be removed by the oracle user. The nosharedvar option assumes the ocr.loc file is not on a shared file system (the default is nosharedvar). If the ocr.loc file is on a shared file system, then run the CRS_HOME/install/rootdelete.sh remote sharedvar command instead.
rootdelete.sh nosharedhome
Getting local node name
NODE = rac2
Getting local node name
NODE = rac2
CRS-0210: Could not find resource 'ora.rac2.ons'.
CRS-0210: Could not find resource 'ora.rac2.vip'.
CRS-0210: Could not find resource 'ora.rac2.gsd'.
Stopping resources.
This could take several minutes.
Successfully stopped Oracle Clusterware resources
Stopping Cluster Synchronization Services.
Shutting down the Cluster Synchronization Services daemon.
Shutdown request successfully issued.
Waiting for Cluster Synchronization Services daemon to stop
Waiting for Cluster Synchronization Services daemon to stop
Cluster Synchronization Services daemon has stopped
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
Cleaning up Network socket directories
11. To delete the node from the cluster, run $CRS_HOME/install/rootdeletenode.sh as root from any surviving node. Identify the node name and number with olsnodes. If this step is not performed, the olsnodes command will continue to display the deleted node as a part of the cluster.
olsnodes -n
rac1    1
rac2    2

rootdeletenode.sh rac2,2
CRS-0210: Could not find resource 'ora.rac2.ons'.
CRS-0210: Could not find resource 'ora.rac2.vip'.
CRS-0210: Could not find resource 'ora.rac2.gsd'.
PRKO-2112 : Some or all node applications are not removed successfully on node: rac2
CRS-0210: Could not find resource 'ora.rac2.vip'.
CRS-0210: Could not find resource 'ora.rac2.ons'.
CRS-0210: Could not find resource 'ora.rac2.gsd'.
CRS nodeapps are deleted successfully
clscfg: EXISTING configuration version 4 detected.
clscfg: version 4 is 11 Release 1.
Value SYSTEM.crs.versions.rac2 marked for deletion is not there. Ignoring.
Successfully deleted 15 values from OCR.
Key SYSTEM.css.interfaces.noderac2 marked for deletion is not there. Ignoring.
Key SYSTEM.crs.versions.rac2 marked for deletion is not there. Ignoring.
Successfully deleted 13 keys from OCR.
Node deletion operation successful.
'rac2,2' deleted successfully
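olsnodes can confirm the removal; only the surviving node should now be listed:

olsnodes -n    # should now show only rac1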
12. Remove the node from the inventory list by running the following on the node that has been deleted.
runInstaller -updateNodeList ORACLE_HOME=$CRS_HOME "CLUSTER_NODES={rac2}" CRS=TRUE -local -silent
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4094 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/app/oracle/oraInventory
13. Deinstall the cluster home (in a non-shared CRS home) by running the following on the node that has been deleted. THIS STEP COULD BE A DISASTER IF STEP 12 HASN'T REMOVED THE INVENTORY ENTRIES OF THE OTHER NODES FROM THE LOCAL INVENTORY FILE. IF THOSE NODES HAVEN'T BEEN REMOVED FROM THE INVENTORY OF THE LOCAL NODE BEING DELETED, THEN THIS STEP WOULD WIPE OUT THE CLUSTERWARE HOME FILES ON ALL THE NODES IN THE CLUSTER. THE CLUSTER WILL SHUT ITSELF DOWN AFTER SOME TIME AND A NEW INSTALL OF THE CLUSTERWARE IS NEEDED TO RECOVER FROM THIS. THE COMMAND IS LISTED IN THE ORACLE DOCUMENTATION (STEP 8). Maybe the command should be of the form
runInstaller -deinstall -silent "REMOVE_HOMES={/opt/crs/oracle/product/11.1.0/crs}" -local
"-local" is missing on Oracle documentation.
It worked fine on a two-node cluster during testing, but it failed and wiped out the clusterware files on two clusters with 3 or more nodes, even after running step 12 above (which succeeded without error, although the inventory file wasn't checked explicitly to see whether the other nodes had been removed). The exact reason is not known yet; this blog will be updated as things progress.
Take EXTREME caution with this step when following the Oracle documentation. This step can be skipped (go straight to step 14) and the CRS home deleted manually using an OS utility instead; any residual files should be cleaned up as well. A pre-flight check and the manual alternative are sketched below.
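A minimal pre-flight check, assuming the inventory location shown earlier (/opt/app/oracle/oraInventory); make sure only the node being deleted is listed before running the deinstall:

grep "NODE NAME" /opt/app/oracle/oraInventory/ContentsXML/inventory.xml    # should list only rac2 on this node

# manual alternative to this step: remove the local CRS home with an OS utility
rm -rf /opt/crs/oracle/product/11.1.0/crs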
runInstaller -deinstall -silent "REMOVE_HOMES={/opt/crs/oracle/product/11.1.0/crs}" -local
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4094 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2010-12-06_11-47-38AM. Please wait ...
Oracle Universal Installer, Version 11.1.0.7.0 Production
Copyright (C) 1999, 2008, Oracle. All rights reserved.

Starting deinstall

Deinstall in progress (Monday, December 6, 2010 11:47:47 AM GMT)
WARNING:The directory: /opt/crs/oracle/product/11.1.0/crs will be deleted after deinstall.
Click on "Yes" to continue.
Click on "No" to perform deinstall without deleting the directory.
Click on "Cancel" to go back to "Inventory Dialog".
............................................................... 100% Done.

Deinstall successful
Removing Cluster Oracle homes
SEVERE:Remote 'RemoveHome' failed on nodes: 'rac2'. Refer to '/opt/app/oracle/oraInventory/logs/installActions2010-12-06_11-47-38AM.log' for details.
You can manually re-run the following command on the failed nodes after the installation:
/opt/crs/oracle/product/11.1.0/crs/oui/bin/runInstaller -removeHome -noClusterEnabled ORACLE_HOME=/opt/crs/oracle/product/11.1.0/crs -cfs  LOCAL_NODE=.

End of install phases.(Monday, December 6, 2010 11:49:01 AM GMT)
End of deinstallations
Please check '/opt/app/oracle/oraInventory/logs/silentInstall2010-12-06_11-47-38AM.log' for more details
In spite of the error message, the CRS home is cleared and all daemons and inittab entries are removed.

14. On any remaining node run the following command to update the node list. Check inventory.xml on all nodes afterwards to verify that the node has been removed.
runInstaller -updateNodeList ORACLE_HOME=$CRS_HOME "CLUSTER_NODES={rac1}" CRS=TRUE
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3907 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/app/oracle/oraInventory
'UpdateNodeList' was successful.
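A quick check on each remaining node (inventory path as shown in the output above):

grep "NODE NAME" /opt/app/oracle/oraInventory/ContentsXML/inventory.xml    # rac2 should no longer appear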
This concludes the final phase.

Related Post
Deleting a Node From 11gR2 RAC