Wednesday, December 22, 2010

EMD upload error: uploadXMLFiles skipped

There are situations where the agent upload command gives the following error
emctl upload
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD upload error: uploadXMLFiles skipped :: OMS version not checked yet. If this issue persists check trace files for ping to OMS
related errors.
To fix this issue run
exec mgmt_admin.cleanup_agent('host:port'); 
where host:port is the hostname and port shown in the agent status output
Agent Version     : 11.1.0.1.0
OMS Version       : 11.1.0.1.0
Protocol Version  : 11.1.0.0.0
Agent Home        : /opt/app/oracle/grid_agent/agent11g
Agent binaries    : /opt/app/oracle/grid_agent/agent11g
Agent Process ID  : 25049
Parent Process ID : 25027
Agent URL         : https://hpc1.domain.net:3872/emd/main/
Repository URL    : https://hpc4.domain.net:4900/em/upload
This command should be run on the grid control repository database, logged in as sysman. It will remove the remote host's information from the grid control repository.
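For the agent shown above, the call would be as follows (a sketch; the hostname and port are taken from the Agent URL in the status output):
exec mgmt_admin.cleanup_agent('hpc1.domain.net:3872');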

On the remote host itself run the following (paths are relative to the agent home, referred to here as $AGENT_HOME)
rm -r $AGENT_HOME/sysman/emd/state/*
rm -r $AGENT_HOME/sysman/emd/collection/*
rm -r $AGENT_HOME/sysman/emd/upload/*
rm $AGENT_HOME/sysman/emd/lastupld.xml
rm $AGENT_HOME/sysman/emd/agntstmp.txt
rm $AGENT_HOME/sysman/emd/blackouts.xml
rm $AGENT_HOME/sysman/emd/protocol.ini
Some files may not exist in the agent home.

Clear the agent state
$AGENT_HOME/bin/emctl clearstate agent
Secure the agent again
emctl secure agent
and start the agent
emctl start agent
Monitor the new host being registered on the grid control target page. In the agent status output, monitor the following lines for successful uploads
Last successful upload                     
Total Megabytes of XML files uploaded so far 
Number of XML files pending upload          
Size of XML files pending upload(MB)        
Last successful heartbeat to OMS


Removing Grid Control Agent from One Node Only in Silent Mode

This post is for the situation where the grid control agent is removed from only one node in the cluster.

1. Stop the agent on the remote host
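For example, using the agent home from this post:
/opt/app/oracle/grid_agent/agent11g/bin/emctl stop agent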

2. Run the following on the node on which agent is being removed
/opt/app/oracle/grid_agent/agent11g/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/opt/app/oracle/grid_agent/agent11g "CLUSTER_NODES={rac2}" -local
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 4094 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/app/oracle/oraInventory
3. Deinstall the agent on the node. The Oracle documentation says to use -forceDeinstall, but the installer complains about this option.
oui/bin/runInstaller -silent  "REMOVE_HOMES={/opt/app/oracle/grid_agent/agent11g}" -deinstall -waitForCompletion -removeallfiles -local -forceDeinstall
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 4094 MB Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2010-12-15_12-28-38PM. Please wait ...Oracle Universal Installer, Version 11.1.0.8.0 Production
Copyright (C) 1999, 2010, Oracle. All rights reserved.

The command line arguments '-forceDeinstall' are not valid options. Type 'runInstaller -help' at the command line for instructions on appropriate command line usage.
SEVERE:The command line arguments '-forceDeinstall' are not valid options. Type 'runInstaller -help' at the command line for instructions on appropriate command line usage.
Run without it and the command executes without an error
oui/bin/runInstaller -silent  "REMOVE_HOMES={/opt/app/oracle/grid_agent/agent11g}" -deinstall -waitForCompletion -removeallfiles -local
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 4094 MB Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2010-12-15_12-29-00PM. Please wait ...Oracle Universal Installer, Version 11.1.0.8.0 Production
Copyright (C) 1999, 2010, Oracle. All rights reserved.

Starting deinstall

Deinstall in progress (Wednesday, December 15, 2010 12:29:05 PM GMT)
Configuration assistant "Agent Deinstall Assistant" succeeded
Configuration assistant "Oracle Configuration Manager Deinstall" succeeded
............................................................... 100% Done.

Deinstall successful

End of install phases.(Wednesday, December 15, 2010 12:30:20 PM GMT)
End of deinstallations
Please check '/opt/app/oracle/oraInventory/logs/silentInstall2010-12-15_12-29-00PM.log' for more details.
4. On each of the remaining nodes, run the installer with the remaining node list.
./runInstaller -updateNodeList ORACLE_HOME=/opt/app/oracle/grid_agent/agent11g "CLUSTER_NODES={rac1}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 4094 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/app/oracle/oraInventory
'UpdateNodeList' was successful.
5. Remove all targets on that node from the grid control target page.

Tuesday, December 21, 2010

Recover From a Clusterware Home Deletion

This post is a result of the outcome seen twice on node deletion step 14.
It is assumed that all the clusterware files inside the CRS_HOME have been deleted and, at best, only the directories are available without any files in them.

In this case a new clusterware installation is needed, unless those clusterware homes (each home individually) were backed up using some OS utility (this option hasn't been tested).

Even though the clusterware is newly installed, the oracle home and asm home are intact and cluster aware. So what is required is to add the existing homes, database, instances etc. to the new clusterware and OCR, to be managed by the cluster.

1. Remove startup commands from /etc/inittab
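The entries to remove typically look like the following (standard 10g/11g CRS inittab entries; run levels may vary):
h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null
h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null
h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null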

2. Remove symbolic links and init scripts from init.d and rc#.d
cd /etc/rc.d
for i in $(find . -name 'S96*'); do rm -f $i; done
for i in $(find . -name 'K19*'); do rm -f $i; done
3. Remove the directory with the OCR location and SCR settings.
rm -rf /etc/oracle
4. Change permissions on the clusterware file (ocr, vote) locations
chown oracle:oinstall /dev/raw/raw1
chown oracle:oinstall /dev/raw/raw2
5. Zero out the block or character devices used for ocr and vote disk storage
dd if=/dev/zero of=/dev/raw/raw1 bs=8192 count=1024
dd if=/dev/zero of=/dev/raw/raw2 bs=8192 count=1024
6. Back up the current inventory folder and create a new empty inventory folder (oraInventory). Change its permissions to 770 and its ownership to oracle:oinstall.
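A minimal sketch, using the inventory location seen later in this post:
mv /opt/app/oracle/oraInventory /opt/app/oracle/oraInventory.bak
mkdir /opt/app/oracle/oraInventory
chown oracle:oinstall /opt/app/oracle/oraInventory
chmod 770 /opt/app/oracle/oraInventory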

7. Remove the crs home and create a new directory with ownership oracle:oinstall. Execute runInstaller for the clusterware. If the clusterware had been upgraded (e.g. it started as 10.2 and had been upgraded to 11.1 when the files got deleted), then start from the initial version if the ocr and vote disk locations don't have enough space to accommodate the new version, and upgrade afterwards to match the version in use before the files got deleted. Once the installation is complete, attach the other oracle homes (ORACLE_HOME, ASM_HOME) to the new inventory.
./runInstaller -silent -attachHome ORACLE_HOME=$ORACLE_HOME ORACLE_HOME_NAME="oracle_home" "CLUSTER_NODES={rac1,rac2}"
Once the installation is completed, add the listener, asm, database and instances.

8. Add listener
srvctl add listener -n rac1 -o $ORACLE_HOME   
srvctl add listener -n rac2 -o $ORACLE_HOME
srvctl start listener -n rac1
srvctl start listener -n rac2
9. Add asm
srvctl add asm -n rac1 -i +ASM1 -o $ORACLE_HOME
srvctl add asm -n rac2 -i +ASM2 -o $ORACLE_HOME
srvctl start asm -n rac1
srvctl start asm -n rac2
10. Add database
srvctl add database -d racdb -o $ORACLE_HOME -p '+DATA/racdb/spfileracdb.ora'
srvctl modify database -d racdb -o $ORACLE_HOME -p '+DATA/racdb/spfileracdb.ora' -m domain.net
11. Add instance
srvctl add instance -d racdb -i racdb1 -n rac1
srvctl modify instance -d racdb -i racdb1 -s +ASM1

srvctl add instance -d racdb -i racdb2 -n rac2
srvctl modify instance -d racdb -i racdb2 -s +ASM2
12. Start the database
srvctl start database -d racdb


Sunday, December 19, 2010

ASM Disk Creation Fails with EMC PowerPath

Creating an ASM disk with oracleasm createdisk fails with
Marking disk "/dev/emcpowera1" as an ASM disk: asmtool: Device "/dev/emcpowera1" is not a partition [FAILED]
even though multipathing is setup correctly.

According to metalink note 469163.1 ASMLib: oracleasm createdisk command fails: Device '/dev/emcpowera1' is not a partition, this is due to the PowerPath version being less than 5.3.0.

In spite of this error, ASM disks can be created using asmtool by specifying the force option.
# /usr/sbin/asmtool -C -l /dev/oracleasm -n DATA01 -s /dev/emcpowera1 -a force=yes
Once created, the disk will be visible through the listdisks command, and running a scandisks on the other nodes in the cluster will make it discoverable on those nodes as well.
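For example, assuming the standard ASMLib command location:
# on the node where the disk was created
/usr/sbin/oracleasm listdisks
# on the other cluster nodes
/usr/sbin/oracleasm scandisks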

Another related metalink note is 566676.1 How To Determinate If An EMCPOWER Partition Is Valid For ASMLIB? which lists a different cause for the problem, but the error message shown is the same as above.

Diskmon related CRS startup Issue

Running root.sh after a clusterware upgrade does not start all the cluster services but times out at EVMD. Looking at the ocssd.log, the following could be observed
[    CSSD]2010-12-17 11:14:39.774 [1223936320] >TRACE:   clssgmExecuteClientRequest: GRKJOIN recvd from client 60 (0x2aaaac1ddbd0)
[ CSSD]2010-12-17 11:14:39.774 [1223936320] >TRACE: clssgmJoinGrock: global grock crs_version new client 0x2aaaac1ddbd0 with con 0x2aaaac1dd8d0, requested num -1
[ CSSD]2010-12-17 11:14:39.774 [1223936320] >TRACE: clssgmJoinGrock: ignoring grock join before fatal for grock (-1/0x800/crs_version)
[ CSSD]2010-12-17 11:14:40.019 [1223936320] >TRACE: clssgmRegisterClient: proc(3/0x2aaaac1d88e0), client(57/0x2aaaac1ddfc0)
[ CSSD]2010-12-17 11:14:40.019 [1223936320] >TRACE: clssgmExecuteClientRequest: GRKJOIN recvd from client 57 (0x2aaaac1ddfc0)
[ CSSD]2010-12-17 11:14:40.019 [1223936320] >TRACE: clssgmJoinGrock: global grock EVMDMAIN new client 0x2aaaac1ddfc0 with con 0x2aaaac1ddcc0, requested num 1
[ CSSD]2010-12-17 11:14:40.019 [1223936320] >TRACE: clssgmJoinGrock: ignoring grock join before fatal for grock (1/0x200/EVMDMAIN)
[ CSSD]2010-12-17 11:14:40.073 [1129527616] >TRACE: clssgmDispatchCMXMSG: msg type(5) src(2) dest(65535) size(420) tag(00000000) incarnation(187985787)
[ CSSD]2010-12-17 11:14:40.073 [1129527616] >TRACE: clssgmHandleJoinUpdate: (src 2/2) grock SRVM.DATABASE.NODEAPPS.racb01, gid 890, gin 1, member 0
[ CSSD]2010-12-17 11:14:40.073 [1129527616] >TRACE: clssgmAddMember: granted member(0) flags(0x1001) node(2) grock (0x15d08ac0/SRVM.DATABASE.NODEAPPS.racb01)
[ CSSD]2010-12-17 11:14:40.073 [1129527616] >TRACE: clssgmCommonAddMember: global lock grock SRVM.DATABASE.NODEAPPS.racb01 member(0/Remote) node(2) flags 0x1001 0x1000001
[ CSSD]2010-12-17 11:14:40.073 [1129527616] >TRACE: clssgmDispatchCMXMSG: msg type(5) src(2) dest(65535) size(420) tag(00000000) incarnation(187985787)
[ CSSD]2010-12-17 11:14:40.073 [1129527616] >TRACE: clssgmHandleJoinUpdate: (src 2/2) grock SRVM.DATABASE.NODEAPPS.racb04, gid 891, gin 1, member 0
[ CSSD]2010-12-17 11:14:40.073 [1129527616] >TRACE: clssgmAddMember: granted member(0) flags(0x1001) node(2) grock (0x15d09050/SRVM.DATABASE.NODEAPPS.racb04)
[ CSSD]2010-12-17 11:14:40.073 [1129527616] >TRACE: clssgmCommonAddMember: global lock grock SRVM.DATABASE.NODEAPPS.racb04 member(0/Remote) node(2) flags 0x1001 0x1000001
[ CSSD]2010-12-17 11:14:40.077 [1129527616] >TRACE: clssgmDispatchCMXMSG: msg type(6) src(2) dest(65535) size(352) tag(00000000) incarnation(187985787)
[ CSSD]2010-12-17 11:14:40.077 [1129527616] >TRACE: clssgmHandleExitUpdate: (src 2) grock SRVM.DATABASE.NODEAPPS.racb01, member 0
[ CSSD]2010-12-17 11:14:40.077 [1129527616] >TRACE: clssgmRemoveMember: grock SRVM.DATABASE.NODEAPPS.racb01, member number 0 (0x15d08e90) node number 2 state 0x0 member refcnt 1 grock type 3
[ CSSD]2010-12-17 11:14:40.077 [1129527616] >TRACE: clssgmDispatchCMXMSG: msg type(6) src(2) dest(65535) size(352) tag(00000000) incarnation(187985787)
[ CSSD]2010-12-17 11:14:40.077 [1129527616] >TRACE: clssgmHandleExitUpdate: (src 2) grock SRVM.DATABASE.NODEAPPS.racb04, member 0
[ CSSD]2010-12-17 11:14:40.077 [1129527616] >TRACE: clssgmRemoveMember: grock SRVM.DATABASE.NODEAPPS.racb04, member number 0 (0x15d09420) node number 2 state 0x0 member refcnt 1 grock type 3
[ CSSD]2010-12-17 11:14:40.533 [1265895744] >TRACE: kgzf_dskm_conn4: unable to connect to master diskmon in 60120 msec
[ CSSD]2010-12-17 11:14:40.533 [1265895744] >TRACE: kgzf_send_main1: connection to master diskmon timed out
[ CSSD]2010-12-17 11:14:40.534 [1286875456] >TRACE: KGZF: Fatal diskmon condition, IO fencing is not available. For additional error info look at the master diskmon log file (diskmon.log)

[ CSSD]2010-12-17 11:14:40.534 [1286875456] >ERROR: ASSERT clsssc.c 2471
[ CSSD]2010-12-17 11:14:40.534 [1286875456] >ERROR: clssscSAGEInitFenceCompl: Fence completion failed, rc 56859
[ CSSD]2010-12-17 11:14:40.534 [1223936320] >TRACE: clssgmClientShutdown: Aborting client (0x2aaaac1d3e60) proc (0x2aaaac1d3390)
[ CSSD]2010-12-17 11:14:40.534 [1223936320] >TRACE: clssgmClientShutdown: waited 0 seconds on 1 IO capable clients
[ CSSD]2010-12-17 11:14:40.534 [1223936320] >TRACE: clssgmClientShutdown: Waiting for I/O capable proc (0x2aaaac1d3390), pid (7826)
[ CSSD]2010-12-17 11:14:40.636 [1223936320] >TRACE: clssgmClientShutdown: Waiting for I/O capable proc (0x2aaaac1d3390), pid (7826)
[ CSSD]2010-12-17 11:14:40.637 [1223936320] >TRACE: clssgmDeadProc: proc 0x2aaaac1d3390
[ CSSD]2010-12-17 11:14:40.637 [1223936320] >TRACE: clssgmFenceClient: fencing client (0x2aaaac1d3e60), member 1 in group #CSS_CLSSOMON, no share, death 1, SAGE 0
[ CSSD]2010-12-17 11:14:40.637 [1223936320] >TRACE: clssgmUnreferenceMember: global grock #CSS_CLSSOMON member 1 refcount is 1
[ CSSD]2010-12-17 11:14:40.637 [1223936320] >TRACE: clssgmFenceProcessDeath: client (0x2aaaac1d3e60) pid 7826 undead
[ CSSD]2010-12-17 11:14:40.637 [1223936320] >TRACE: clssgmQueueFenceForCheck: (0x2aaaac1dd750) Death check for object type 3, pid 7826
[ CSSD]2010-12-17 11:14:40.637 [1223936320] >TRACE: clssgmDestroyProc: cleaning up proc(0x2aaaac1d3390) con(0x2aaaac1ca990) skgpid ospid 7826 with 0 clients, refcount 0
[ CSSD]2010-12-17 11:14:40.637 [1213446464] >TRACE: clssgmFenceCompletion: (0x2aaaac1dd750) process death fence completed for process 7826, object type 3
[ CSSD]2010-12-17 11:14:40.637 [1213446464] >TRACE: clssgmTermMember: Terminating member 1 (0x2aaaac1d39e0) in grock #CSS_CLSSOMON
[ CSSD]2010-12-17 11:14:40.637 [1213446464] >TRACE: clssgmAllocateRPCIndex: allocated rpc 1 (0x2aaaaad00090)
[ CSSD]2010-12-17 11:14:40.637 [1223936320] >WARNING: clssgmClientShutdown: graceful shutdown completed.
[ CSSD]2010-12-17 11:14:40.637 [1213446464] >TRACE: clssgmRPC: rpc 0x2aaaaad00090 (RPC#1) tag(1002a) sent to node 2
[ CSSD]2010-12-17 11:14:40.638 [1129527616] >TRACE: clssgmDispatchCMXMSG: msg type(6) src(2) dest(65535) size(352) tag(0001002a) incarnation(187985787)
[ CSSD]2010-12-17 11:14:40.638 [1129527616] >TRACE: clssgmHandleExitUpdate: (src 2) grock #CSS_CLSSOMON, member 1
[ CSSD]2010-12-17 11:14:40.638 [1129527616] >TRACE: clssgmRPCDone: rpc 0x2aaaaad00090 (RPC#1) state 6, flags 0x100
[ CSSD]2010-12-17 11:14:40.638 [1140017472] >TRACE: clssnmvDoWork: type 7 for disk 0
[ CSSD]2010-12-17 11:14:40.638 [1129527616] >TRACE: clssgmDelMemCmpl: rpc 0x2aaaaad00090, ret 0, client (nil) member 0x2aaaac1d39e0
[ CSSD]2010-12-17 11:14:40.638 [1129527616] >TRACE: clssgmFreeRPCIndex: freeing rpc 1
[ CSSD]2010-12-17 11:14:40.638 [1129527616] >TRACE: clssgmRemoveMember: grock #CSS_CLSSOMON, member number 1 (0x2aaaac1d39e0) node number 1 state 0x10 member refcnt 0 grock type 2
[ CSSD]2010-12-17 11:14:40.638 [1223936320] >TRACE: clssnmSendManualShut: Notifying all nodes that this node has been manually shut down
Looking in the diskmon.log shows
[ DISKMON]

I/O Fencing and SKGXP HA monitoring daemon -- Version 1.0.0.0
Process 20415 started on 12/17/2010 at 11:14:41.125

[ DISKMON] 12/17/2010 11:14:41.130 dskm main: starting up
[ DISKMON] 12/17/2010 11:14:41.131 [20415:1122142528] dskm_rac_thrd_main: running
[ DISKMON] 12/17/2010 11:14:41.131 [20415:1898995808] dskm_rac_thrd_creat2: got the post from the css event handling thread
[ DISKMON] 12/17/2010 11:14:41.131 [20415:1898995808] dskm main10: skgznp_rm_pipe failed with error 56825
[ DISKMON] 12/17/2010 11:14:41.131 [20415:1898995808] dskm_main10: error 56825 at location skgznprmpipe - unlink() - Operation not permitted
[ DISKMON] 12/17/2010 11:14:41.131 [20415:1898995808] dskm main11: skgznp_create(default pipe) failed with error 56810
[ DISKMON] 12/17/2010 11:14:41.131 [20415:1898995808] dskm_main11: error 56810 at location skgznpcre3 - bind() - Address already in use
[ DISKMON]
Process 20415 exiting on 12/17/2010 at 11:14:41.131
Look in the /tmp directory for a file named .oracle_master_diskmon; when this error occurs its ownership is set to owner root and group root. To fix this issue change the ownership to oracle:oinstall as per metalink note 870832.1 CRS Startup Issues Due to DISKMON.
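For example:
ls -l /tmp/.oracle_master_diskmon
chown oracle:oinstall /tmp/.oracle_master_diskmon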

Removing the /tmp/.oracle_master_diskmon file and rebooting the node also works.

ORA-600 [kccsbck_first]

Starting an oracle instance in a cluster could throw the following error.
SQL> startup
ORACLE instance started.

Total System Global Area 6413680640 bytes
Fixed Size 2171672 bytes
Variable Size 1107299560 bytes
Database Buffers 5284823040 bytes
Redo Buffers 19386368 bytes
ORA-00600: internal error code, arguments: [kccsbck_first], [1], [3224794587],
[], [], [], [], [], [], [], [], []
On Linux x86_64 restarting the cluster would resolve this error.

More information on metalink note 157536.1 ORA-600 [kccsbck_first] - What to Check

Wednesday, December 15, 2010

Installing Grid Control 11gR1 and Deploying Agents

Installing Enterprise Manager Grid Control has three main parts.
1. Installing Weblogic for the webtier of the grid control
2. Installing the database
3. Finally, installing the actual grid control software

Some of the information is directly copied from the Enterprise Manager Grid Control Basic Installation Guide and is shown in italics.

1. Install Java before installing weblogic. For the Linux platform (32/64 bit) the JDK version must be SUN JDK 1.6_18 or higher. JRockit is not supported.

2. Install Weblogic. For 64bit version download the generic version of the installation, which is a jar file (eg. wls1032_generic.jar)

There are several requirements before installing weblogic.

Ensure that Oracle WebLogic Server 10.3.2 (Oracle Fusion Middleware 11g Release 1 Patch Set 1) is already installed on the host where you plan to install Enterprise Manager Grid Control.

Ensure that the installation was a typical installation, and even if it was a custom installation, ensure that components chosen for custom installation were the same as the ones associated with a typical installation.

Ensure that the installation was under the Middleware Home directory. For example, /opt/app/oracle/Middleware/wlserver_10.3

Ensure that no other Oracle Fusion Middleware products or components are installed in the Middleware Home directory where Oracle WebLogic Server 10.3.2 is installed.


3. Patch weblogic with patch ID WDJ7. This patch fixes bugs 8990616, 9100465, and 9221722. For information on applying this patch, see My Oracle Support note 1072763.1 How to Download and Apply the Recommended WLS patch WDJ7 on WLS for 11g Grid Control Installation or Upgrade
The Oracle Smart Update utility must be used for this, which can be run with Middleware_Home/utils/bsu/bsu.sh; it requires a metalink login to download the patch.

Select the patch and download it. Once the patch is downloaded it will be available in the downloaded section. Click the Manage Patches tab and apply the patch. Further information on patches and activation is available in the weblogic documentation.

4. Install the database software and create the database. The database software could be 11.1 or 11.2 but requires some patching.

Ensure that the existing, certified Oracle Database is one of the databases listed in My Oracle Support note 412431.1. The database can reside either on the host where you are installing the product or on a remote host.

(Optional) If you are installing using Oracle Database 11g Release 1 (11.1.0.7.0), then ensure that you apply the patch for bug# 9066130.
This patch is only available for 11.1.0.7.1; anything higher would have the fix in the PSU, so there is no need to apply it separately.

(Optional) If you are installing using Oracle Database 11g Release 2 (11.2.0.1.0), then ensure that you apply the patch for bug# 9002336 and 9067282.

Ensure that existing, certified Oracle Database is not in QUIESCE mode.

Ensure that your existing, certified Oracle Database does NOT have Database Control SYSMAN schema. If it has, that is, if your existing database is configured with Database Control, then deconfigure it.

The best option is to create the database using dbca without selecting the enterprise manager option. The sysman schema will still be there, but the account is expired and locked. Drop the sysman schema and em with
emca -deconfig dbcontrol db -repos drop -SYS_PWD gridb -SYSMAN_PWD gridb
If a sysman schema is there, the installation of grid control will fail.

Ensure that the fine-grained access control option is set to TRUE in the existing, certified Oracle Database so that the Management Repository can be created. To verify this, run the following command:
select value from v$option where parameter = 'Fine-grained access control';
Make the temporary tablespace and undo tablespace auto extensible.
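A minimal sketch, assuming conventional file names (substitute the repository database's actual tempfile and undo datafile names):
SQL> alter database tempfile '/u01/oradata/gridb/temp01.dbf' autoextend on;
SQL> alter database datafile '/u01/oradata/gridb/undotbs01.dbf' autoextend on;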

If the database is not in archive log mode, change it to archive log mode.
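A minimal sketch of the change, run as sysdba:
SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database archivelog;
SQL> alter database open;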

5. Execute the runInstaller for the grid control software. Create the oms directory under the middleware directory (eg. /opt/app/oracle/Middleware/oms11g). At the end of the post installation phase, the finish page will show the oms url and the admin url.
6. After the installation ends successfully, the OMS and the Management Agent start automatically; these can be checked with
oms status check
/opt/app/oracle/Middleware/oms11g/bin/emctl status oms
Oracle Enterprise Manager 11g Release 1 Grid Control
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
WebTier is Up
Oracle Management Server is Up
agent status check
/opt/app/oracle/Middleware/agent11g/bin/emctl status agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 11.1.0.1.0
OMS Version       : 11.1.0.1.0
Protocol Version  : 11.1.0.0.0
Agent Home        : /opt/app/oracle/Middleware/agent11g
Agent binaries    : /opt/app/oracle/Middleware/agent11g
Agent Process ID  : 21415
Parent Process ID : 21389
Agent URL         : https://hpc4.domain.net:3872/emd/main/
Repository URL    : https://hpc4.domain:4900/em/upload
Started at        : 2010-12-07 17:37:03
Started by user   : oracle
Last Reload       : 2010-12-07 17:37:25
Last successful upload                       : 2010-12-08 10:23:24
Total Megabytes of XML files uploaded so far :    88.26
Number of XML files pending upload           :        0
Size of XML files pending upload(MB)         :     0.00
Available disk space on upload filesystem    :    52.55%
Last successful heartbeat to OMS             : 2010-12-08 10:26:51
---------------------------------------------------------------
Agent is Running and Ready
agent upload check
/opt/app/oracle/Middleware/agent11g/bin/emctl upload
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD upload completed successfully
7. Log in to the grid control as sysman and configure the Oracle Database and Oracle Automatic Storage Management (Oracle ASM) targets for monitoring.

In the Enterprise Manager Grid Control console, when you view the Home page for an Oracle Database target for the first time, the Database Home page may not display any monitoring data, and the status of the database may indicate that there is a metric collection error. This is because the DBSNMP credentials have not been configured, or the account has been locked due to unsuccessful login attempts.

Similarly, when you view the Home page of an Oracle Automatic Storage Management (Oracle ASM) target for the first time, the status of the Oracle ASM instance may be unknown or unavailable, and the Home page may indicate that the Management Agent is unavailable (down). Again, this is because you have not specified the ASM SYS credentials.

To fix this problem for an Oracle Database target, if your DBSNMP user account is locked, then unlock it by following these steps:

In Grid Control, click Targets and then Databases.
On the Databases page, from the table that lists all databases, click a database name.
On the Database Home page, click the Server tab.
On the Server page, from the Security section, click Users. If you are prompted to log in to the database, make sure to use a database user account with DBA privileges such as SYSTEM.
On the Users page, find and select the DBSNMP user account. From the Actions list, select Unlock User, and click Go. If you are asked to confirm whether you want to unlock the DBSNMP user account, click Yes.
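The same unlock can also be done directly in SQL*Plus with a DBA account (the password value is a placeholder):
SQL> alter user dbsnmp account unlock;
SQL> alter user dbsnmp identified by new_password;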


8. For accessing the Enterprise Manager Grid Control console, ensure that you use only certified browsers as mentioned in My Oracle Support note 412431.1

9. Adding agents to monitor remote targets. Several pre-reqs must be satisfied before agents can be deployed to remote hosts.
If you want to view the status of an installation session that was previously run, then click Agent Installation Status on the Deployments page. However, do not attempt to view the installation status until the installation is complete. If you do, you will see an error.

Ensure that you install the Management Agent only on certified operating systems as mentioned in My Oracle Support note 412431.1.

Ensure that you allocate 100 MB of space for the central inventory directory.

Also ensure that the central inventory directory is not on a shared file system. If it is already on a shared file system, then switch over to a non-shared file system by following the instructions outlined in My Oracle Support note 1092645.1

Ensure that the installation base directory you specify is empty and has write permission.

If you want to install Oracle Management Agent 10g, then ensure that you download that Management Agent software.

If you want to install Oracle Management Agent 11g Release 1 for an operating system that is different from the one on which the Oracle Management Service 11g Release 1 is running, then ensure that you download the Management Agent software for that operating system.



Validate the path to all command locations in oms11g/sysman/agent_download/11.1.0.1.0/linux_x64/agentdeploy (Paths.properties, sPaths.properties, ssPaths_.properties, userPaths.properties).
On linux x86_64 the sudo path is /usr/bin/sudo, but it is listed in the ssPaths_*** file as /usr/local/bin/sudo. Change this.

Ensure that the host names and the IP addresses are properly configured in the /etc/hosts file. For example, for installing a Management Agent on a host, ensure that the host name specified in the /etc/hosts file is unique, and ensure that it maps to the correct IP address of that host. Otherwise, the installation can fail on the product-specific prerequisite check page. The installation uses hostnames instead of IPs even if IPs are specified during the setup. Therefore it's better to have the hostnames of the remote hosts in /etc/hosts on the grid control server.

If the destination host and the host on which OMS is running belong to different network domains, then ensure that you update the /etc/hosts file on the destination host to add a line with the IP address of that host, the fully-qualified name of that host, and the short name of the host.

If the central inventory owner and the user installing the Management Agent are different, then ensure that they are part of the same group.
Also ensure that the inventory owner and the group to which the owner belongs have read and write permissions on the inventory directory.
For example, if the inventory owner is abc and user installing the Management Agent is xyz, then ensure that abc and xyz belong to the same group, and they have read and write access to the inventory.

Ensure that you have read, write, and execute permissions on oraInventory on all remote hosts. If you do not have these permissions on the default inventory (typically at /etc/oraInst.loc) on any remote host, then you can specify the path to an alternative inventory location by using one of the following options in the Additional Parameters section of the Agent Deployment Wizard:
On the Installation Details page, if you select Default, from Management Server location, which means the software is on the OMS host, then use the -i option.
On the Installation Details page, if you select Another Location, which means the software is in any other location other than the OMS host, then use the -invPtrLoc option.

Ensure that the SSH daemon is running on the default port (that is, 22) on all the destination hosts. If the port is a non-default port, that is, any port other than 22, then update the SSH_PORT property in the following file that is present in the OMS Instance Base:
/sysman/prov/resources/Paths.properties
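For example, if SSH listens on port 2222 on the destination hosts (the port value is illustrative), set the property in that file as:
SSH_PORT=2222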

Ensure that the PubkeyAuthentication parameter is enabled in the sshd_config file.
To verify the value of this parameter, run the following command:
grep PubkeyAuthentication /sshd_config
The default is yes, and the above command will show yes but with the line commented out. If it's not explicitly set to no, then no additional work is required.

Add sudo privilege on the remote host to oracle; this can be removed once the agent is installed. This makes the installation run smoother and allows root.sh to run when agents are deployed to remote hosts. Add the following line to /etc/sudoers
oracle  ALL=(ALL)       NOPASSWD:ALL
and comment out the line
Defaults    requiretty
For any problems refer to metalink note 363509.1 Problem - Agent Installation Using Push Method Fails with Error 'User not enabled for sudo' when sudo has been configured correctly.

From the host where grid control is installed it should be possible to ping the host(s) (cluster nodes etc.) and get a response. If the ping response is blocked (i.e. by a firewall), then the agent deployment will fail with a message that the remote host is down.
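For example, from the grid control server (the hostname is taken from this post's examples):
ping -c 3 hpc1.domain.net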

10. To install an agent, from the grid control home page select deployment tab -> install agent -> fresh install. The new agent deployment page will open.
Select the agent version and platform and enter the user details. root.sh could be run after the installation, before the post install scripts, if sudo is not set up. To run root.sh the user entered must have sudo privilege.

For an agent deployment onto a RAC, specify all the nodes that are part of the cluster, or a subset of nodes if agents are not being deployed on some nodes.

Once the installation is completed, the agent status on the remote host would be as follows
emctl status agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 11.1.0.1.0
OMS Version       : 11.1.0.1.0
Protocol Version  : 11.1.0.0.0
Agent Home        : /opt/app/oracle/grid_agent/agent11g
Agent binaries    : /opt/app/oracle/grid_agent/agent11g
Agent Process ID  : 22954
Parent Process ID : 22917
Agent URL         : http://hpc1.domain.net:3872/emd/main/
Repository URL    : HTTPS://hpc4.domain.net:4900/em/upload/
Started at        : 2010-12-08 12:37:27
Started by user   : oracle
Last Reload       : 2010-12-08 12:37:27
Last successful upload                       : (none)
Last attempted upload                        : (none)
Total Megabytes of XML files uploaded so far :     0.00
Number of XML files pending upload           :       23
Size of XML files pending upload(MB)         :    11.19
Available disk space on upload filesystem    :     9.98%
Last attempted heartbeat to OMS              : 2010-12-08 12:40:59
Last successful heartbeat to OMS             : unknown
---------------------------------------------------------------
Agent is Running and Ready
Secure the agent with
emctl secure agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Agent successfully stopped...   Done.
Securing agent...   Started.
Enter Agent Registration Password :
Agent successfully restarted...   Done.
Securing agent...   Successful.
The password is the agent registration password used during the installation of the grid software. After the securing is done, agent communication will begin.
emctl status agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 11.1.0.1.0
OMS Version       : 11.1.0.1.0
Protocol Version  : 11.1.0.0.0
Agent Home        : /opt/app/oracle/grid_agent/agent11g
Agent binaries    : /opt/app/oracle/grid_agent/agent11g
Agent Process ID  : 25049
Parent Process ID : 25027
Agent URL         : https://hpc1.domain.net:3872/emd/main/
Repository URL    : https://hpc4.domain.net:4900/em/upload
Started at        : 2010-12-08 12:42:43
Started by user   : oracle
Last Reload       : 2010-12-08 12:42:43
Last successful upload                       : 2010-12-08 12:42:49
Total Megabytes of XML files uploaded so far :     6.02
Number of XML files pending upload           :       40
Size of XML files pending upload(MB)         :    29.29
Available disk space on upload filesystem    :     9.95%
Last successful heartbeat to OMS             : 2010-12-08 12:42:45
---------------------------------------------------------------
Agent is Running and Ready
When agent communication begins, the target database will automatically appear on the target page (a page refresh might be required)

11. Add the target database by enabling dbsnmp
Set the monitoring credentials for a standalone Oracle Database or Oracle RAC database. To do so, follow these steps:
In Grid Control, click Targets and then Databases.
On the Databases page, find and select the database target and click Monitoring Configuration.
On the Properties page, specify the password for the DBSNMP user in the Monitor Password field. To verify the monitoring credentials, click Test Connection.
If the connection is successful, click Next, then click Submit.


For ASM
Set the monitoring credentials for Oracle ASM. To do so, follow these steps:
In Grid Control, click Targets and then Databases.
On the Databases page, find and select the Oracle ASM target and click Monitoring Configuration.
On the Properties page, specify the password for the ASMSYS user in the Password field. To verify the monitoring credentials, click Test Connection.
If the connection is successful, click Next, then click Submit.


In a cluster database the domain name is added to the short vip names, which may not be in /etc/hosts; in this case the cluster db status may not get updated. Edit the listener host to reflect the real vip host name.

12. Stopping and starting Grid Control. The stop sequence is: stop agent, oms and db. The start sequence is: start db, oms and agent.

stop agent
/opt/app/oracle/Middleware/agent11g/bin/emctl stop agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Stopping agent ... stopped.
stop oms
/opt/app/oracle/Middleware/oms11g/bin/emctl stop oms -all
Oracle Enterprise Manager 11g Release 1 Grid Control
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Stopping WebTier...
WebTier Successfully Stopped
Stopping Oracle Management Server...
Oracle Management Server Successfully Stopped
Oracle Management Server is Down
Then stop the database.

Starting oms (start the database first if it is not started)
/opt/app/oracle/Middleware/oms11g/bin/emctl start oms
Oracle Enterprise Manager 11g Release 1 Grid Control
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Starting WebTier...
WebTier Successfully Started
Starting Oracle Management Server...
Oracle Management Server Successfully Started
Oracle Management Server is Up
start agent
/opt/app/oracle/Middleware/agent11g/bin/emctl start agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Starting agent ...... started.
Some useful metalink notes

How to Install Web Logic Server 10.3.2 for Installing 11g Grid Control [ID 1063762.1]
Step by Step Installation of Enterprise Manager Grid Control 11.1.0.1 [ID 1059516.1]
Grid Control 11g: Example for Installing WebLogic Server 10.3.2 on OEL 5.3 x86_64 [ID 1063112.1]
Case Study - Installing Grid Control 11.1.0.1 - Installation of jdk1.6 On Linux x86_64 Before Installing WebLogic Server 10.3.2 [ID 1063587.1]
Grid Control 11g: How to Install 11.1.0.1.0 on OEL5.3 x86_64 with a 11.1.0.7.0 Repository Database [ID 1064495.1]
Required External Components and Versions for the Grid Control 11.1.0.1.0 Installation [ID 1106105.1]
Master Note for Grid Control 11.1.0.1.0 Installation and Upgrade [ID 1067438.1]
Required Patches for Grid Control 11g (11.1.0.1.0) [ID 1101208.1]
How to Determine the List of Patch Set Update(PSU) Applied to the Enterprise Manager OMS and Agent Oracle Homes? [ID 1358092.1]
Oracle Enterprise Manager Grid Control Certification Checker [ID 412431.1]

Tuesday, December 7, 2010

Adding a Node to 11gR1 RAC

This post only focuses on adding the Oracle components to the new node. It is assumed that other pre-reqs such as installing Oracle, configuring the NICs, creating the Oracle user and establishing user equivalence are done. Similar to deleting a node, adding a node is also done in phases. The first phase includes adding the clusterware; in the second phase the Oracle Database software is added; and finally the database and ASM instances are extended to the new node.

Both silent and interactive options could be used for this. Here, for most of the commands, the silent option has been selected.

0. Backup vote disk and ocr
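A minimal sketch, run as root (device and backup paths are illustrative, following the raw devices used elsewhere in these posts):
# list and back up the vote disk with dd
crsctl query css votedisk
dd if=/dev/raw/raw2 of=/backup/votedisk.bak
# export the ocr
ocrconfig -export /backup/ocr.exp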

1. To add the clusterware to the new node, run addNode.sh on an existing node. In interactive mode this would open a GUI where the public, vip and private interconnect information is entered. The same could be achieved by running addNode.sh in silent mode as below
$CRS_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={rac2}" "CLUSTER_NEW_PRIVATE_NODE_NAMES={rac2-pvt}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={rac2-vip}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 4094 MB Passed
Oracle Universal Installer, Version 11.1.0.7.0 Production
Copyright (C) 1999, 2008, Oracle. All rights reserved.


Performing tests to see whether nodes rac2 are available
......................................................... 100% Done.

-------------------------------------------------------------------
Cluster Node Addition Summary
Global Settings
Source: /opt/crs/oracle/product/11.1.0/crs
New Nodes
Space Requirements
New Nodes
rac2
/: Required 3.45GB : Available 10.48GB
Installed Products
Product Names
Oracle Database 10g Release 2 Patch Set 4 11.1.0.5.0
Oracle Clusterware 11.1.0.6.0
Oracle Database 11g Patch Set 1 11.1.0.7.0
Bali Share 1.1.18.0.0
Oracle Ice Browser 5.2.3.6.0
Bali Share 1.1.19.0.0
Buildtools Common Files 11.1.0.5.0
Enterprise Manager Minimal Integration 11.1.0.5.0
SSL Required Support Files for InstantClient Patch 11.1.0.5.0
DBJAVA Required Support Files Patch 11.1.0.5.0
Agent Required Support Files Patch 11.1.0.5.0
Sun JDK 1.5.0.11.0
Cluster Verification Utility Files 11.1.0.6.0
Oracle Required Support Files 32 bit 11.1.0.6.0
Cluster Verification Utility Common Files 11.1.0.6.0
Oracle Clusterware RDBMS Files 11.1.0.6.0
Oracle Extended Windowing Toolkit 3.4.47.0.0
Buildtools Common Files 11.1.0.6.0
Oracle Notification Service 11.1.0.5.0
Oracle RAC Required Support Files-HAS 11.1.0.6.0
SQL*Plus Required Support Files 11.1.0.6.0
XDK Required Support Files 11.1.0.6.0
Agent Required Support Files 11.1.0.3.1
Parser Generator Required Support Files 11.1.0.6.0
Precompiler Required Support Files 11.1.0.6.0
Platform Required Support Files 11.1.0.6.0
Oracle Core Required Support Files 11.1.0.6.0
Oracle Globalization Support 11.1.0.6.0
Perl Interpreter 5.8.3.0.4
Oracle JFC Extended Windowing Toolkit 4.2.36.0.0
Oracle Help For Java 4.2.9.0.0
LDAP Required Support Files 11.1.0.6.0
SSL Required Support Files for InstantClient 11.1.0.6.0
Oracle Net Required Support Files 11.1.0.6.0
RDBMS Required Support Files for Instant Client 11.1.0.6.0
RDBMS Required Support Files 11.1.0.6.0
Enterprise Manager Minimal Integration 11.1.0.6.0
Oracle Locale Builder 11.1.0.6.0
Oracle Globalization Support 11.1.0.6.0
HAS Common Files 11.1.0.6.0
Cluster Ready Services Files 11.1.0.6.0
Required Support Files 11.1.0.6.0
Installer SDK Component 11.1.0.7.0
Oracle One-Off Patch Installer 11.1.0.7.0
Oracle Universal Installer 11.1.0.7.0
Oracle Notification Service Patch 11.1.0.7.0
Required Support Files Patch 11.1.0.7.0
Perl Interpreter Patch 5.8.3.0.4p
Oracle Required Support Files 32 bit Patch 11.1.0.7.0
Oracle Clusterware RDBMS Files Patch 11.1.0.7.0
Precompiler Required Support Files Patch 11.1.0.7.0
RDBMS Required Support Files for Instant Client Patch 11.1.0.7.0
XDK Required Support Files Patch 11.1.0.7.0
SQL*Plus Required Support Files Patch 11.1.0.7.0
Parser Generator Required Support Files Patch 11.1.0.7.0
Oracle Core Required Support Files Patch 11.1.0.7.0
Oracle Locale Builder Patch 11.1.0.7.0
Oracle Globalization Support Patch 11.1.0.7.0
Oracle Globalization Support Patch 11.1.0.7.0
Oracle Net Required Support Files Patch 11.1.0.7.0
RDBMS Required Support Files Patch 11.1.0.7.0
SSL Required Support Files for InstantClient Patch 11.1.0.7.0
LDAP Required Support Files Patch 11.1.0.7.0
Oracle RAC Required Support Files-HAS Patch 11.1.0.7.0
Cluster Verification Utility Files Patch 11.1.0.7.0
Cluster Ready Services Files Patch 11.1.0.7.0
Cluster Verification Utility Common Files Patch 11.1.0.7.0
HAS Common Files Patch 11.1.0.7.0
Oracle Clusterware Patch 11.1.0.7.0
Platform Required Support Files Patch 11.1.0.7.0
-------------------------------------------------------------------


Instantiating scripts for add node (Monday, December 6, 2010 1:02:44 PM GMT) 1% Done.
Instantiation of add node scripts complete

Copying to remote nodes (Monday, December 6, 2010 1:02:49 PM GMT)
........................................................ 96% Done.
Home copied to new nodes

Saving inventory on nodes (Monday, December 6, 2010 1:08:05 PM GMT) 100% Done.
Save inventory complete
WARNING:
The following configuration scripts need to be executed as the "root" user in each cluster node.
#!/bin/sh
#Root script to run
/opt/crs/oracle/product/11.1.0/crs/install/rootaddnode.sh #On nodes rac1
/opt/crs/oracle/product/11.1.0/crs/root.sh #On nodes rac2
To execute the configuration scripts:
1. Open a terminal window
2. Log in as "root"
3. Run the scripts in each cluster node

The Cluster Node Addition of /opt/crs/oracle/product/11.1.0/crs was successful.
Please check '/tmp/silentInstall.log' for more details.
As mentioned in the output above, run the two scripts as root on the respective nodes. On rac1
# /opt/crs/oracle/product/11.1.0/crs/install/rootaddnode.sh
clscfg: EXISTING configuration version 4 detected.
clscfg: version 4 is 11 Release 1.
Attempting to add 1 new nodes to the configuration
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 2: rac2 rac2-pvt rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
/opt/crs/oracle/product/11.1.0/crs/bin/srvctl add nodeapps -n rac2 -A rac2-vip/255.255.255.0/eth0
On the new node rac2
/opt/crs/oracle/product/11.1.0/crs/root.sh
No need to validate voting disks
Checking to see if Oracle CRS stack is already configured
OCR LOCATIONS = /dev/raw/raw1
OCR backup directory '/opt/crs/oracle/product/11.1.0/crs/cdata/mycluster' does not exist. Creating now
Setting the permissions on OCR backup directory
Setting up Network socket directories
Oracle Cluster Registry configuration upgraded successfully
clscfg: EXISTING configuration version 4 detected.
clscfg: version 4 is 11 Release 1.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: rac1 rac1-pvt rac1
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
Cluster Synchronization Services is active on these nodes.
rac1
rac2
Cluster Synchronization Services is active on all the nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
2. Running the above scripts would have created the nodeapps on rac2 (the new node). But there's a problem with ONS on 11gR1; therefore ONS may not be up and running on the new node
crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE OFFLINE
ora.rac2.vip application ONLINE ONLINE rac2
ora.racdb.db application ONLINE ONLINE rac1
ora....b1.inst application ONLINE ONLINE rac1
Trying to start it will throw the following error
srvctl stop nodeapps -n rac2
srvctl start nodeapps -n rac2
rac2:ora.rac2.ons:Number of onsconfiguration retrieved, numcfg = 1
rac2:ora.rac2.ons:onscfg[0]
rac2:ora.rac2.ons: {node = rac1.domain.net, port = 6200}
rac2:ora.rac2.ons:Adding remote host rac1.domain.net:6200
rac2:ora.rac2.ons:Number of onsconfiguration retrieved, numcfg = 1
rac2:ora.rac2.ons:onscfg[0]
rac2:ora.rac2.ons: {node = rac1.domain.net, port = 6200}
rac2:ora.rac2.ons:Adding remote host rac1.domain.net:6200
rac2:ora.rac2.ons:onsctl: ons failed to start
rac2:ora.rac2.ons:Number of onsconfiguration retrieved, numcfg = 1
rac2:ora.rac2.ons:onscfg[0]
rac2:ora.rac2.ons: {node = rac1.domain.net, port = 6200}
rac2:ora.rac2.ons:Adding remote host rac1.domain.net:6200
rac2:ora.rac2.ons:ons is not running ...
CRS-0215: Could not start resource 'ora.rac2.ons'.
Remove the existing ONS configuration and add a new configuration with the full host name
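The removal step is not shown in the captured output; assuming the stale entry was registered with the short host name, it can be removed with racgons as well:
racgons remove_config rac2:6200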
racgons add_config rac2.domain.net:6200
srvctl start nodeapps -n rac2

crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
ora.racdb.db application ONLINE ONLINE rac1
ora....b1.inst application ONLINE ONLINE rac1
This concludes phase one.



3. To add the Oracle Database software to new node run the $ORACLE_HOME/oui/bin/addNode.sh on an existing node.
$ORACLE_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={rac2}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 3812 MB Passed
Oracle Universal Installer, Version 11.1.0.7.0 Production
Copyright (C) 1999, 2008, Oracle. All rights reserved.


Performing tests to see whether nodes rac2 are available
............................................................... 100% Done.

-----------------------------------------------------------------------------
Cluster Node Addition Summary
Global Settings
Source: /opt/app/oracle/product/11.1.0/db_1
New Nodes
Space Requirements
New Nodes
rac2
/: Required 4.92GB : Available 7.77GB
Installed Products
Product Names
Oracle Database 11g 11.1.0.6.0
Oracle Database 11g Patch Set 1 11.1.0.7.0
Sun JDK 1.5.0.11.0
Oracle ODBC Driverfor Instant Client 11.1.0.6.0
LDAP Required Support Files 11.1.0.6.0
SSL Required Support Files for InstantClient 11.1.0.6.0
Oracle Net Required Support Files 11.1.0.6.0
Buildtools Common Files 11.1.0.6.0
Bali Share 1.1.18.0.0
Oracle Real Application Testing 11.1.0.6.0
Oracle Data Mining RDBMS Files 11.1.0.6.0
Oracle OLAP RDBMS Files 11.1.0.6.0
Oracle OLAP API 11.1.0.6.0
Oracle Extended Windowing Toolkit 3.4.47.0.0
Oracle JFC Extended Windowing Toolkit 4.2.36.0.0
SQL*Plus Required Support Files 11.1.0.6.0
Oracle RAC Required Support Files-HAS 11.1.0.6.0
XDK Required Support Files 11.1.0.6.0
Provisioning Advisor Framework 11.1.0.3.1
Enterprise Manager Database Plugin -- Repository Support 11.1.0.5.0
Enterprise Manager Repository Core Files 11.1.0.3.1
Enterprise Manager Database Plugin -- Management Service Support 11.1.0.5.0
Enterprise Manager Database Plugin -- Agent Support 11.1.0.5.0
Enterprise Manager Grid Control Core Files 11.1.0.3.1
Enterprise Manager Common Core Files 11.1.0.3.1
Enterprise Manager Agent Core Files 11.1.0.3.1
RDBMS Required Support Files for Instant Client 11.1.0.6.0
Oracle Display Fonts 9.0.2.0.0
RDBMS Required Support Files 11.1.0.6.0
Perl Interpreter 5.8.3.0.4
Oracle Ultra Search Common Files 11.1.0.6.0
Oracle Ultra Search Middle-Tier 11.1.0.6.0
Oracle Ultra Search Server 11.1.0.6.0
Oracle 11g Warehouse Builder Server 11.1.0.6.0
Agent Required Support Files 11.1.0.3.1
Oracle Database 11g Multimedia Files 11.1.0.6.0
Oracle Multimedia Java Advanced Imaging 11.1.0.6.0
Oracle Multimedia Annotator 11.1.0.6.0
Oracle Globalization Support 11.1.0.6.0
Oracle Multimedia Locator RDBMS Files 11.1.0.6.0
Parser Generator Required Support Files 11.1.0.6.0
Precompiler Required Support Files 11.1.0.6.0
Sample Schema Data 11.1.0.6.0
Oracle Starter Database 11.1.0.6.0
Oracle Message Gateway Common Files 11.1.0.6.0
Oracle XML Query 11.1.0.6.0
XML Parser for Oracle JVM 11.1.0.6.0
Oracle JDBC/OCI Instant Client 11.1.0.6.0
Installation Plugin Files 11.1.0.6.0
Enterprise Manager Common Files 11.1.0.3.1
regexp 2.1.9.0.0
Oracle JDBC Server Support Package 11.1.0.6.0
Oracle SQL Developer 11.1.0.6.0
Oracle Application Express 11.1.0.6.0
Oracle Ice Browser 5.2.3.6.0
Platform Required Support Files 11.1.0.6.0
Oracle Core Required Support Files 11.1.0.6.0
SQLJ Runtime 11.1.0.6.0
Database Workspace Manager 11.1.0.6.0
JAccelerator (COMPANION) 11.1.0.6.0
Oracle Containers for Java 11.1.0.6.0
Oracle Help For Java 4.2.9.0.0
Oracle Ultra Search Server Rdbms 11.1.0.6.0
Oracle Code Editor 1.2.1.0.0I
Oracle Required Support Files 32 bit 11.1.0.6.0
Oracle Universal Connection Pool 11.1.0.6.0
Oracle Multimedia Client Option 11.1.0.6.0
Oracle JDBC/THIN Interfaces 11.1.0.6.0
Oracle Java Client 11.1.0.6.0
Secure Socket Layer 11.1.0.6.0
Oracle Locale Builder 11.1.0.6.0
Character Set Migration Utility 11.1.0.6.0
Oracle Globalization Support 11.1.0.6.0
PL/SQL Embedded Gateway 11.1.0.6.0
OLAP SQL Scripts 11.1.0.6.0
Database SQL Scripts 11.1.0.6.0
Required Support Files 11.1.0.6.0
Oracle ODBC Driver 11.1.0.6.0
SQL*Plus Files for Instant Client 11.1.0.6.0
Oracle Database User Interface 2.2.13.0.0
Enterprise Manager Minimal Integration 11.1.0.6.0
XML Parser for Java 11.1.0.6.0
Oracle Security Developer Tools 11.1.0.6.0
Oracle Wallet Manager 11.1.0.6.0
Cluster Verification Utility Common Files 11.1.0.6.0
Oracle Clusterware RDBMS Files 11.1.0.6.0
Precompiler Common Files 11.1.0.6.0
Oracle UIX 2.2.20.0.0
Oracle Help for the Web 2.0.14.0.0
HAS Common Files 11.1.0.6.0
SQL*Plus 11.1.0.6.0
Oracle LDAP administration 11.1.0.6.0
Enterprise Manager plugin Common Files 11.1.0.5.0
Installation Common Files 11.1.0.6.0
Assistant Common Files 11.1.0.6.0
Oracle Notification Service 11.1.0.5.0
Oracle Net 11.1.0.6.0
Oracle Recovery Manager 11.1.0.6.0
PL/SQL 11.1.0.6.0
Secure Socket Layer 11.1.0.6.0
Oracle Database Utilities 11.1.0.6.0
Oracle Internet Directory Client 11.1.0.6.0
Oracle Multimedia Locator 11.1.0.6.0
Oracle Multimedia 11.1.0.6.0
Generic Connectivity Common Files 11.1.0.6.0
Oracle XML Development Kit 11.1.0.6.0
Database Configuration and Upgrade Assistants 11.1.0.6.0
Oracle JVM 11.1.0.6.0
Oracle Advanced Security 11.1.0.6.0
Oracle Database Gateway for ODBC 11.1.0.6.0
Oracle Programmer 11.1.0.6.0
Enterprise Manager Agent 11.1.0.3.1
Oracle Call Interface (OCI) 11.1.0.6.0
HAS Files for DB 11.1.0.6.0
Oracle Net Listener 11.1.0.6.0
Oracle Enterprise Manager Console DB 11.1.0.5.0
Oracle Net Services 11.1.0.6.0
Oracle Text 11.1.0.6.0
Oracle Database 11g 11.1.0.6.0
Oracle OLAP 11.1.0.6.0
Oracle Spatial 11.1.0.6.0
Oracle Partitioning 11.1.0.6.0
Enterprise Edition Options 11.1.0.6.0
Installer SDK Component 11.1.0.7.0
Oracle One-Off Patch Installer 11.1.0.7.0
Oracle Universal Installer 11.1.0.7.0
Oracle Warehouse Builder Required Support Files 11.1.0.7.0
Oracle Configuration Manager 10.3.0.1.0
Oracle Notification Service Patch 11.1.0.7.0
Required Support Files Patch 11.1.0.7.0
Oracle 11g Warehouse Builder Server 11.1.0.7.0
Enterprise Manager Repository Core Files 11.1.0.4.1
Enterprise Manager Common Core Files Patch 11.1.0.4.1
Enterprise Manager Agent Core Files Patch 11.1.0.4.1
Enterprise Manager Grid Control Core Files Patch 11.1.0.4.1
Provisioning Advisor Framework Patch 11.1.0.4.1
Enterprise Manager Common Files Patch 11.1.0.4.1
Enterprise Manager Agent Patch 11.1.0.4.1
Enterprise Manager Database Plugin -- Agent Support Patch 11.1.0.7.0
Enterprise Manager Database Plugin -- Repository Support Patch 11.1.0.7.0
Enterprise Manager plugin Common Files Patch 11.1.0.7.0
Oracle Enterprise Manager Database Console Patch 11.1.0.7.0
Perl Interpreter Patch 5.8.3.0.4p
Oracle Net Listener Patch 11.1.0.7.0
Oracle Required Support Files 32 bit Patch 11.1.0.7.0
Secure Socket Layer Patch 11.1.0.7.0
Oracle Partitioning Patch 11.1.0.7.0
Oracle Multimedia Locator RDBMS Files Patch 11.1.0.7.0
Oracle Clusterware RDBMS Files Patch 11.1.0.7.0
Oracle Database 11g Multimedia Files Patch 11.1.0.7.0
Oracle Real Application Testing Patch 11.1.0.7.0
PL/SQL Patch 11.1.0.7.0
Oracle OLAP RDBMS Files Patch 11.1.0.7.0
Oracle Data Mining RDBMS Files Patch 11.1.0.7.0
Precompiler Required Support Files Patch 11.1.0.7.0
Oracle Multimedia Annotator Patch 11.1.0.7.0
JAccelerator (COMPANION) Patch 11.1.0.7.0
Oracle ODBC Driver Patch 11.1.0.7.0
Oracle Call Interface (OCI) Patch 11.1.0.7.0
RDBMS Required Support Files for Instant Client Patch 11.1.0.7.0
PL/SQL Embedded Gateway Patch 11.1.0.7.0
Oracle Advanced Security Patch 11.1.0.7.0
Oracle JDBC/THIN Interfaces Patch 11.1.0.7.0
Installation Plugin Files Patch 11.1.0.7.0
Oracle LDAP administration Patch 11.1.0.7.0
Oracle Ultra Search Server Rdbms Patch 11.1.0.7.0
Oracle Universal Connection Pool Patch 11.1.0.7.0
Oracle XML Development Kit Patch 11.1.0.7.0
Oracle XML Query Patch 11.1.0.7.0
XDK Required Support Files Patch 11.1.0.7.0
XML Parser for Java Patch 11.1.0.7.0
SQL*Plus Patch 11.1.0.7.0
SQL*Plus Required Support Files Patch 11.1.0.7.0
SQL*Plus Files for Instant Client Patch 11.1.0.7.0
SQLJ Runtime Patch 11.1.0.7.0
Parser Generator Required Support Files Patch 11.1.0.7.0
Oracle Database 11g Patch 11.1.0.7.0
Oracle Spatial Patch 11.1.0.7.0
Oracle Multimedia Locator Patch 11.1.0.7.0
Oracle Database 11g Patch 11.1.0.7.0
Oracle Database Utilities Patch 11.1.0.7.0
Installation Common Files Patch 11.1.0.7.0
Oracle Recovery Manager Patch 11.1.0.7.0
Oracle Starter Database Patch 11.1.0.7.0
Sample Schema Data Patch 11.1.0.7.0
Oracle Database Gateway for ODBC Patch 11.1.0.7.0
Generic Connectivity Common Files Patch 11.1.0.7.0
Database SQL Scripts Patch 11.1.0.7.0
Precompiler Common Files Patch 11.1.0.7.0
Database Workspace Manager Patch 11.1.0.7.0
Oracle Multimedia Patch 11.1.0.7.0
Oracle Multimedia Java Advanced Imaging Patch 11.1.0.7.0
Oracle Multimedia Client Option Patch 11.1.0.7.0
OLAP SQL Scripts Patch 11.1.0.7.0
Oracle OLAP API Patch 11.1.0.7.0
Oracle OLAP Patch 11.1.0.7.0
Oracle Core Required Support Files Patch 11.1.0.7.0
Oracle ODBC Driverfor Instant Client Patch 11.1.0.7.0
Oracle Locale Builder Patch 11.1.0.7.0
Oracle Globalization Support Patch 11.1.0.7.0
Oracle Globalization Support Patch 11.1.0.7.0
Secure Socket Layer Patch 11.1.0.7.0
Oracle Net Required Support Files Patch 11.1.0.7.0
Oracle Net Patch 11.1.0.7.0
RDBMS Required Support Files Patch 11.1.0.7.0
Oracle Message Gateway Common Files Patch 11.1.0.7.0
Oracle Application Express Patch 11.1.0.7.0
Oracle Security Developer Tools Patch 11.1.0.7.0
SSL Required Support Files for InstantClient Patch 11.1.0.7.0
LDAP Required Support Files Patch 11.1.0.7.0
Oracle Wallet Manager Patch 11.1.0.7.0
Oracle Internet Directory Client Patch 11.1.0.7.0
Oracle Containers for Java Patch 11.1.0.7.0
Oracle Java Client Patch 11.1.0.7.0
Oracle JVM Patch 11.1.0.7.0
Oracle Ultra Search Server Patch 11.1.0.7.0
Oracle Ultra Search Common Files Patch 11.1.0.7.0
Oracle Ultra Search Middle-Tier Patch 11.1.0.7.0
Oracle RAC Required Support Files-HAS Patch 11.1.0.7.0
HAS Files for DB Patch 11.1.0.7.0
Cluster Verification Utility Common Files Patch 11.1.0.7.0
HAS Common Files Patch 11.1.0.7.0
Oracle JDBC/OCI Instant Client Patch 11.1.0.7.0
Oracle SQL Developer Patch 11.1.0.7.0
Oracle Text Patch 11.1.0.7.0
Character Set Migration Utility Patch 11.1.0.7.0
Platform Required Support Files Patch 11.1.0.7.0
Database Configuration and Upgrade Assistants Patch 11.1.0.7.0
Assistant Common Files Patch 11.1.0.7.0
-----------------------------------------------------------------------------


Instantiating scripts for add node (Monday, December 6, 2010 2:01:19 PM GMT) 1% Done.
Instantiation of add node scripts complete

Copying to remote nodes (Monday, December 6, 2010 2:01:47 PM GMT)
.......................................................... 96% Done.
Home copied to new nodes

Saving inventory on nodes (Monday, December 6, 2010 2:13:10 PM GMT) 100% Done.
Save inventory complete
WARNING:
The following configuration scripts need to be executed as the "root" user in each cluster node.
#!/bin/sh
#Root script to run
/opt/app/oracle/product/11.1.0/db_1/root.sh #On nodes rac2
To execute the configuration scripts:
1. Open a terminal window
2. Log in as "root"
3. Run the scripts in each cluster node

The Cluster Node Addition of /opt/app/oracle/product/11.1.0/db_1 was successful.
Please check '/tmp/silentInstall.log' for more details.
Run the above mentioned root script.

This concludes phase two.

4. On the new node run the Net Configuration Assistant (NETCA) to add a listener, selecting only the new node from the node selection list. Alternatively, the reconfigure option could be selected, which will reconfigure the listeners for the entire RAC. After the reconfiguration the listeners must be manually started on all nodes, which may not be desirable on a production system.
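If the reconfigure route is taken, a minimal sketch of restarting the listeners afterwards with srvctl, assuming the default listener names (node names here are illustrative):
srvctl start listener -n rac1
srvctl start listener -n rac2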

5. From an existing node run dbca to extend the database to the new node; this will add a new database instance on the new node. From the options in dbca select instance management, then the add instance option, select the new node and proceed to completion.

While extending the database, ASM will also be extended if the existing nodes use ASM, and a message to that effect may be seen in these situations. The end of this operation also concludes the addition of the new node.
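As a quick verification, the instance and ASM status on the new node could be checked with srvctl (the database name racdb is illustrative):
srvctl status database -d racdb
srvctl status asm -n rac2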

Related Post
Adding a Node to 11gR2 RAC

Deleting a 11gR1 RAC Node

Removing a node that is part of a RAC involves several phases. The first phase is to remove the instances (both database and ASM) and to stop the nodeapps. The second phase is to remove the Oracle Database software and the nodeapps, and the final phase is to remove the clusterware. These must be done in this order to cleanly remove a node from the RAC.

An 11gR1 (11.1.0.7) RAC with two nodes (rac1, rac2) has been used here, and rac2 will be deleted.

0. Back up the voting disk and the OCR

1. Run dbca from a node that is not being deleted (in this case rac1). Select instance management from the options, then the delete instance option, and proceed to select the instance on the node that is to be deleted (the database instance on rac2).

2. If ASM on that node is no longer needed (no other instances are on that node) then remove the ASM instance on that node by running
srvctl stop asm -n rac2
srvctl remove asm -n rac2
3. Stop the nodeapps and remove the listener on the node to be deleted using netca.
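Stopping the nodeapps is done with srvctl, typically run as root; a minimal sketch for this configuration:
srvctl stop nodeapps -n rac2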

This concludes the first phase.

4. Remove the Oracle Database software by running the following command on the node to be deleted
$ORACLE_HOME/oui/bin/runInstaller -deinstall -silent "REMOVE_HOMES={/opt/app/oracle/product/11.1.0/db_1}" -local
5. If database resources are running on the node to be deleted then relocate them to a surviving node.
crs_stat

NAME=ora.racdb.db
TYPE=application
TARGET=ONLINE
STATE=ONLINE on rac2
It may be the case that no resources are running on the node to be deleted, in which case relocating is not necessary. If there are resources, then relocate them with
crs_relocate ora.racdb.db
6. Remove nodeapps by running the following as root
srvctl remove nodeapps -n rac2
Please confirm that you intend to remove the node-level applications on node rac2 (y/[n]) y
7. Update the node list in the Oracle inventory. Run this on any surviving node.
$ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME CLUSTER_NODES=rac1
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3906 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/app/oracle/oraInventory
'UpdateNodeList' was successful.
This concludes the second phase.




8. Remove the stored network interfaces by running the following on any one of the remaining nodes. This step is unnecessary if the Oracle Interface Configuration Tool (OIFCFG) was run with the -global option during the installation, such as by the Oracle Universal Installer.
oifcfg delif -node rac2
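The remaining interface definitions could be listed afterwards to confirm the entry is gone:
oifcfg getif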
9. Identify and remove the ONS remote ports; run the remove command on a node that is going to remain in the cluster.
$CRS_HOME/bin/onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
{node = rac1.domain.net, port = 6200}
Adding remote host rac1.domain.net:6200
onscfg[1]
{node = rac2.domain.net, port = 6200}
Setting remote port from OCR repository to 6200
Adding remote host rac2.domain.net:6200
ons is not running ...                       

racgons remove_config rac2.domain.net:6200
racgons: Existing key value on rac2.domain.net = 6200.
racgons: rac2.domain.net:6200 removed from OCR.
10. On the node to be deleted, run the rootdelete.sh script as the root user from the $CRS_HOME/install directory to disable the Oracle Clusterware applications and daemons running on the node.

Only run this command once, and use the nosharedhome argument if you are using a local file system. The nosharedvar option assumes the ocr.loc file is not on a shared file system. The default for this command is sharedhome, which prevents you from updating the permissions of local files such that they can be removed by the oracle user. If the ocr.loc file is on a shared file system (the default is nosharedvar), then run the $CRS_HOME/install/rootdelete.sh remote sharedvar command.
rootdelete.sh nosharedhome
Getting local node name
NODE = rac2
Getting local node name
NODE = rac2
CRS-0210: Could not find resource 'ora.rac2.ons'.
CRS-0210: Could not find resource 'ora.rac2.vip'.
CRS-0210: Could not find resource 'ora.rac2.gsd'.
Stopping resources.
This could take several minutes.
Successfully stopped Oracle Clusterware resources
Stopping Cluster Synchronization Services.
Shutting down the Cluster Synchronization Services daemon.
Shutdown request successfully issued.
Waiting for Cluster Synchronization Services daemon to stop
Waiting for Cluster Synchronization Services daemon to stop
Cluster Synchronization Services daemon has stopped
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
Cleaning up Network socket directories
11. To delete the node from the cluster, run $CRS_HOME/install/rootdeletenode.sh as root from any surviving node. Identify the node name and number with olsnodes. If this step is not performed, the olsnodes command will continue to display the deleted node as part of the cluster.
olsnodes -n
rac1    1
rac2    2

rootdeletenode.sh rac2,2
CRS-0210: Could not find resource 'ora.rac2.ons'.
CRS-0210: Could not find resource 'ora.rac2.vip'.
CRS-0210: Could not find resource 'ora.rac2.gsd'.
PRKO-2112 : Some or all node applications are not removed successfully on node: rac2
CRS-0210: Could not find resource 'ora.rac2.vip'.CRS-0210: Could not find resource 'ora.rac2.ons'.CRS-0210: Could not find resource 'ora.rac2.gsd'.
CRS nodeapps are deleted successfully
clscfg: EXISTING configuration version 4 detected.
clscfg: version 4 is 11 Release 1.
Value SYSTEM.crs.versions.rac2 marked for deletion is not there. Ignoring.
Successfully deleted 15 values from OCR.
Key SYSTEM.css.interfaces.noderac2 marked for deletion is not there. Ignoring.
Key SYSTEM.crs.versions.rac2 marked for deletion is not there. Ignoring.
Successfully deleted 13 keys from OCR.
Node deletion operation successful.
'rac2,2' deleted successfully
12. Remove the node from the inventory list by running the following on the node that has been deleted.
runInstaller -updateNodeList ORACLE_HOME=$CRS_HOME "CLUSTER_NODES={rac2}" CRS=TRUE -local -silent
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4094 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/app/oracle/oraInventory
13. Deinstall (in a non-shared CRS home) the cluster home by running the following on the node that has been deleted. THIS STEP COULD BE A DISASTER IF STEP 12 HASN'T REMOVED THE INVENTORY ENTRIES OF THE OTHER NODES FROM THE LOCAL INVENTORY FILE. IF IT HASN'T REMOVED THOSE NODES FROM THE INVENTORY OF THE LOCAL NODE WHICH IS BEING DELETED, THEN THIS STEP WOULD WIPE OUT THE CLUSTERWARE HOME FILES ON ALL THE NODES IN THE CLUSTER. THE CLUSTER WILL AUTO SHUTDOWN AFTER SOME TIME AND A NEW INSTALL OF CLUSTERWARE IS NEEDED TO RECOVER FROM THIS. THE COMMAND IS LISTED IN ORACLE DOCUMENTATION STEP 8. Maybe the command should be of the form
runInstaller -deinstall -silent "REMOVE_HOMES={/opt/crs/oracle/product/11.1.0/crs}" -local
"-local" is missing on Oracle documentation.
It worked fine on a two node cluster during testing, but failed and wiped out the clusterware files on two clusters with 3 or more nodes, even after running step 12 above (which succeeded without error, though the inventory file wasn't checked explicitly to see if the other nodes had been removed). The exact reason is not known yet; the blog will be updated as things progress.
Take EXTREME caution with this step when following the Oracle document. This step could be skipped (go straight to 14) and deletion of the CRS home done manually using OS utilities; any residual files should be taken care of as well.
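A minimal sketch of the manual alternative, run as root on the node being deleted. The paths are the ones used in this install; since rootdelete.sh has already cleaned the SCR settings and socket directories, typically only the home and any leftover init scripts remain (verify what exists before removing):
rm -rf /opt/crs/oracle/product/11.1.0/crs
rm -f /etc/init.d/init.crs /etc/init.d/init.crsd /etc/init.d/init.cssd /etc/init.d/init.evmd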
runInstaller -deinstall -silent "REMOVE_HOMES={/opt/crs/oracle/product/11.1.0/crs}" -local
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4094 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2010-12-06_11-47-38AM. Please wait ...
Oracle Universal Installer, Version 11.1.0.7.0 Production
Copyright (C) 1999, 2008, Oracle. All rights reserved.

Starting deinstall

Deinstall in progress (Monday, December 6, 2010 11:47:47 AM GMT)
WARNING:The directory: /opt/crs/oracle/product/11.1.0/crs will be deleted after deinstall.
Click on "Yes" to continue.
Click on "No" to perform deinstall without deleting the directory.
Click on "Cancel" to go back to "Inventory Dialog".
............................................................... 100% Done.

Deinstall successful
Removing Cluster Oracle homes
SEVERE:Remote 'RemoveHome' failed on nodes: 'rac2'. Refer to '/opt/app/oracle/oraInventory/logs/installActions2010-12-06_11-47-38AM.log' for details.
You can manually re-run the following command on the failed nodes after the installation:
/opt/crs/oracle/product/11.1.0/crs/oui/bin/runInstaller -removeHome -noClusterEnabled ORACLE_HOME=/opt/crs/oracle/product/11.1.0/crs -cfs  LOCAL_NODE=.

End of install phases.(Monday, December 6, 2010 11:49:01 AM GMT)
End of deinstallations
Please check '/opt/app/oracle/oraInventory/logs/silentInstall2010-12-06_11-47-38AM.log' for more details
In spite of the error message the CRS home is cleared, and all daemons and inittab entries are removed.

14. On any remaining node run the following command to update the node list. Afterwards check the inventory.xml on all nodes to verify the node has been removed (a quick check is sketched after the output below).
runInstaller -updateNodeList ORACLE_HOME=$CRS_HOME "CLUSTER_NODES={rac1}" CRS=TRUE
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3907 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/app/oracle/oraInventory
'UpdateNodeList' was successful.
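A quick way to verify is to check the node list recorded for the CRS home in the inventory (inventory location as shown in the output above):
grep NODE /opt/app/oracle/oraInventory/ContentsXML/inventory.xml
The deleted node should no longer appear in the NODE_LIST of the CRS home entry.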
This concludes the final phase.

Related Post
Deleting a Node From 11gR2 RAC

How to mount an ISO image file

1. Create a directory (mount point) for the ISO file.

2. As root, mount the file; in this case the default /media directory has been used as the mount point
mount -o loop rhel-server-5.5-x86_64-dvd.iso /media
3. To access the files inside the ISO, cd to /media.

A loop device is a pseudo-device that makes a file accessible as a block device.
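To list the loop devices in use and to detach the image when done:
losetup -a
umount /media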

Monday, December 6, 2010

Upgrading 10gR2 RAC to 11gR1 on RHEL 4

The steps to upgrade a 10gR2 RAC on RHEL 4 to 11gR1 aren't much different from the previous post.

1. The 10g version being upgraded is 10.2.0.5 (with PSU 10.2.0.5.1)

2. As per metalink note 437123.1 for installing 11g on RHEL 4:

Both RHEL AS/ES 4 "update 1" and "update 2" had a problem with the binutils RPM. This problem was not corrected until "update 3". Therefore, you MUST at least use the binutils RPM from "update 3" (or higher). The "update 3" version of the binutils RPM is binutils-2.15.92.0.2-18 (x86_64). Because of the two above problems, Oracle Global Support strongly recommends that you use Red Hat Enterprise Linux ES/AS 4 (update 3 or higher). This is kernel 2.6.9-34 or greater.
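The installed versions could be checked against the above with:
rpm -q binutils
uname -r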

3. Make sure any additional rpms needed for 11g version are installed. Required rpms are listed on the above metalink note.

4. Complete the additional pre-reqs that are not present in 10g. Adding the following to /etc/profile is an example of such a pre-req that wasn't needed for 10g.
if [ $USER = "oracle" ]; then
    if [ $SHELL = "/bin/ksh" ]; then
        ulimit -p 16384
        ulimit -n 65536
    else
        ulimit -u 16384 -n 65536
    fi
fi
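After logging in again as the oracle user, the limits could be verified against the values set above:
ulimit -u
ulimit -n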
5. Do the rest of the upgrade the same as on RHEL 5

Tuesday, November 30, 2010

RAC to Single Instance Physical Standby

Oracle 11g's new active database duplication feature can also be used for creating standby databases in Data Guard configurations. This blog uses the same RAC configuration used for active database duplication to create a single instance physical standby database.

RAC instances are
olsnodes -n
rac4    1
rac5    2
Data files and logfiles are in two ASM diskgroups called +DATA and +FLASH.

The physical standby database will have its files on the local file system and will be referred to as stdby throughout the blog.

1. Install the Oracle Enterprise Edition software on the host where the physical standby will reside. In addition, create the necessary directory structures such as adump ($ORACLE_BASE/admin/sid name/adump) and directories for controlfiles, datafiles and onlinelogs. Though the configuration uses OMF once the setup is completed, these directories are required at the beginning to complete the setup (not required if some other directory path is referenced instead of OMF). For this configuration the following directories were created
cd /data/oradata
mkdir STDBY
cd STDBY
mkdir controlfile  datafile  onlinelog

cd /data/flash_recovery
mkdir STDBY
cd STDBY
mkdir onlinelog
2. Create TNS entries in both RAC nodes' tnsnames.ora files
STDBYTNS =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = standby-host)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SID = stdby)
)
)
3. Create a TNS entry in the standby's tnsnames.ora file
PRIMARYTNS =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = rac1-host)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = rac11g2)
)
)
In this case only one primary instance will be used to fetch archive log gaps from and to send redo to when the switchover happens. In a RAC standby, multiple instances can receive redo but there can only be one applier.

4. Add a static listener entry to the standby host's listener.ora
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = stdby)
(SID_NAME = stdby)
(ORACLE_HOME = /opt/app/oracle/product/11.2.0/ent)
)
)
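After editing listener.ora, reload the listener and check that the static entry is visible; statically registered services show a status of UNKNOWN, which is expected:
lsnrctl reload
lsnrctl status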
5. Enable force logging on the primary
SQL> alter database force logging;
6. Create standby log files for each thread on the primary. These should be the same size as the online redo log files
SQL> alter database add standby logfile thread 1 size 52428800;
or
SQL> alter database add standby logfile thread 1;
SQL> alter database add standby logfile thread 2;
There should be at least one more standby redo log group per thread than the number of online redo log groups.
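The standby redo log groups created can be checked per thread with (sizes shown in MB):
SQL> select thread#, group#, bytes/1024/1024 mb from v$standby_log;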

7. Add Data Guard related initialization parameters to the primary. These include information about the databases involved in the Data Guard configuration, redo transport attributes such as SYNC, ASYNC, AFFIRM, NOAFFIRM, fetch archive log client and server values, and datafile/logfile name conversions.
alter system set log_archive_config='dg_config=(rac11g2,stdby)' scope=both ;
alter system set log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=rac11g2' scope=both;
alter system set log_archive_dest_2='service=STDBYTNS LGWR ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=stdby' scope=both;
alter system set log_archive_dest_state_1='enable' scope=both;
alter system set log_archive_dest_state_2='enable' scope=both;
alter system set fal_server='STDBYTNS' scope=both;
alter system set fal_client='PRIMARYTNS' scope=both;
alter system set log_archive_max_processes=10 scope=both;
alter system set db_file_name_convert='/data/oradata/STDBY','+DATA/rac11g2' scope=spfile;
alter system set log_file_name_convert='/data/flash_recovery/STDBY','+FLASH/rac11g2' scope=spfile;
alter system set standby_file_management='AUTO' scope=both;
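The settings could be verified on the primary before proceeding:
SQL> show parameter log_archive_config
SQL> show parameter fal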
8. Copy the password file to the standby host's ORACLE_HOME/dbs and rename it. Assuming the password file has been copied to the standby host
mv orapwrac11g22 orapwstdby
9. Create a pfile with db_name as the only entry.
*.db_name='stdby'
10. Start the standby instance in nomount mode using the above mentioned pfile
startup nomount pfile=initstdby.ora
11. On the primary, using RMAN, connect to the primary database as target and the standby as the auxiliary and run the active duplication command to create the standby. Some of the RAC-only parameters have been reset while others have been set to reflect the standby database after switchover.
rman target / auxiliary sys/rac11g2db@stdbytns

duplicate target database for standby from active database
spfile
parameter_value_convert 'rac11g2','stdby','RAC11G2','stdby'
set db_unique_name='stdby'
set db_file_name_convert='+DATA/rac11g2','/data/oradata/STDBY','+DATA/rac11g2/tempfile','/data/oradata/STDBY'
set log_file_name_convert='+FLASH/rac11g2','/data/flash_recovery/STDBY','+DATA/rac11g2','/data/flash_recovery/STDBY'
set control_files='/data/oradata/stdby/controlfile/control01.ctl'
set log_archive_max_processes='5'
set fal_client='STDBYTNS'
set fal_server='PRIMARYTNS'
SET cluster_database='false'
reset REMOTE_LISTENER
reset local_listener
set db_create_file_dest  = '/data/oradata'
set db_recovery_file_dest  = '/data/flash_recovery'
set standby_file_management='AUTO'
set log_archive_config='dg_config=(rac11g2,stdby)'
set log_archive_dest_2='service=PRIMARYTNS LGWR ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=rac11g2'
set log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=stdby';
This will start the duplication and creation of the physical standby
Starting Duplicate Db at 30-NOV-10
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=10 device type=DISK

contents of Memory Script:
{
backup as copy reuse
targetfile  '/opt/app/oracle/product/11.2.0/db_1/dbs/orapwrac11g22' auxiliary format
'/opt/app/oracle/product/11.2.0/ent/dbs/orapwstdby'   ;
}
executing Memory Script
...
...
...
Finished Duplicate Db at 30-NOV-10
If any parameter setting has a configuration mismatch and still refers to ASM for files, then the duplication process will terminate with
ERROR: slave communication error with ASM; terminating process 21416
Errors in file /opt/app/oracle/diag/rdbms/stdby/stdby/trace/stdby_lgwr_21416.trc:
Mon Nov 29 17:40:49 2010
PMON (ospid: 21396): terminating the instance due to error 470
Instance terminated by PMON, pid = 21396
12. Once successfully completed, start redo apply on the standby with
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT
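Whether redo apply is running could be checked on the standby; an MRP0 process indicates the managed recovery process is active:
SQL> select process, status, thread#, sequence# from v$managed_standby;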
13. Force a log switch on the primary
alter system switch logfile;
or
alter system archive log current;
14. Check that the logs are being applied on the standby with
select thread#,sequence#,applied from v$archived_log;

THREAD#  SEQUENCE# APPLIED
---------- ---------- ---------
1        108 YES
1        109 YES
1        110 YES
1        111 YES
2         98 YES
2         99 YES
2        100 YES
1        112 IN-MEMORY


Further checks could be performed by creating a tablespace on the primary and checking whether the change gets reflected appropriately on the standby (a sketch is given below).
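A minimal sketch of such a check (test_ts is an illustrative name; since OMF is in use no file name is needed):
-- on the primary
SQL> create tablespace test_ts datafile size 10m;
SQL> alter system archive log current;
-- on the standby (mounted), once the redo is applied
SQL> select name from v$datafile;
With standby_file_management set to AUTO the new datafile should appear under /data/oradata. The tablespace can then be dropped on the primary.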

The spfile created in this scenario will have the RAC instance-level values, similar to active duplication.

Alternatively, a pfile for the standby could be created with all the necessary parameter entries and used to start the standby instance in nomount state. Then, when running the duplicate command, omit the spfile clause.

In this scenario the pfile created in step 9 would be
*.audit_file_dest='/opt/app/oracle/admin/stdby/adump'
*.audit_trail='OS'
*.compatible='11.2.0.0.0'
*.control_files='/data/oradata/STDBY/controlfile/control01.ctl'
*.db_block_size=8192
*.db_create_file_dest='/data/oradata'
*.db_domain='domain.net'
*.db_name='rac11g2'
*.db_unique_name='stdby'
*.db_file_name_convert='+DATA/rac11g2','/data/oradata/STDBY','+DATA/rac11g2/tempfile','/data/oradata/STDBY'
*.log_file_name_convert='+FLASH/rac11g2','/data/flash_recovery/STDBY','+DATA/rac11g2','/data/flash_recovery/STDBY'
*.db_recovery_file_dest='/data/flash_recovery'
*.db_recovery_file_dest_size=40705720320
*.diagnostic_dest='/opt/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=stdbyXDB)'
*.log_archive_format='%t_%s_%r.dbf'
*.log_archive_config='dg_config=(rac11g2,stdby)'
*.log_archive_dest_2='service=PRIMARYTNS LGWR ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=rac11g2'
*.standby_file_management='AUTO'
*.log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=stdby'
*.open_cursors=300
*.pga_aggregate_target=1326448640
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=3707764736
*.undo_tablespace='UNDOTBS1'
*.fal_client='STDBYTNS'
*.fal_server='PRIMARYTNS'
The duplication command in step 11 would be
duplicate target database for standby from active database;
This will not create an spfile for the standby, so at the end of the duplication an spfile should be created explicitly (a sketch is given after the output below). The complete output for this last scenario is given below
rman target / auxiliary sys/rac11g2db@stdbytns

Recovery Manager: Release 11.2.0.1.0 - Production on Tue Nov 30 13:02:16 2010

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: RAC11G2 (DBID=371695083)
connected to auxiliary database: RAC11G2 (not mounted)

RMAN> duplicate target database for standby from active database;

Starting Duplicate Db at 30-NOV-10
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=10 device type=DISK

contents of Memory Script:
{
backup as copy reuse
targetfile  '/opt/app/oracle/product/11.2.0/db_1/dbs/orapwrac11g22' auxiliary format
'/opt/app/oracle/product/11.2.0/ent/dbs/orapwstdby'   ;
}
executing Memory Script

Starting backup at 30-NOV-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=152 instance=rac11g22 device type=DISK
Finished backup at 30-NOV-10

contents of Memory Script:
{
backup as copy current controlfile for standby auxiliary format  '/data/oradata/stdby/control01.ctl';
}
executing Memory Script

Starting backup at 30-NOV-10
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
copying standby control file
output file name=/opt/app/oracle/product/11.2.0/db_1/dbs/snapcf_rac11g22.f tag=TAG20101130T130231 RECID=33 STAMP=736434151
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
Finished backup at 30-NOV-10

contents of Memory Script:
{
sql clone 'alter database mount standby database';
}
executing Memory Script

sql statement: alter database mount standby database

contents of Memory Script:
{
set newname for tempfile  1 to
"/data/oradata/stdby/tempfile/temp.263.732796409";
switch clone tempfile all;
set newname for datafile  1 to
"/data/oradata/stdby/datafile/system.256.732796287";
set newname for datafile  2 to
"/data/oradata/stdby/datafile/sysaux.257.732796289";
set newname for datafile  3 to
"/data/oradata/stdby/datafile/undotbs1.258.732796289";
set newname for datafile  4 to
"/data/oradata/stdby/datafile/users.259.732796291";
set newname for datafile  5 to
"/data/oradata/stdby/datafile/undotbs2.264.732796603";
backup as copy reuse
datafile  1 auxiliary format
"/data/oradata/stdby/datafile/system.256.732796287"   datafile
2 auxiliary format
"/data/oradata/stdby/datafile/sysaux.257.732796289"   datafile
3 auxiliary format
"/data/oradata/stdby/datafile/undotbs1.258.732796289"   datafile
4 auxiliary format
"/data/oradata/stdby/datafile/users.259.732796291"   datafile
5 auxiliary format
"/data/oradata/stdby/datafile/undotbs2.264.732796603"   ;
sql 'alter system archive log current';
}
executing Memory Script

executing command: SET NEWNAME

renamed tempfile 1 to /data/oradata/stdby/tempfile/temp.263.732796409 in control file

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

Starting backup at 30-NOV-10
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=+DATA/rac11g2/datafile/system.256.732796287
output file name=/data/oradata/stdby/datafile/system.256.732796287 tag=TAG20101130T130241
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:01:05
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=+DATA/rac11g2/datafile/sysaux.257.732796289
output file name=/data/oradata/stdby/datafile/sysaux.257.732796289 tag=TAG20101130T130241
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:55
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=+DATA/rac11g2/datafile/undotbs1.258.732796289
output file name=/data/oradata/stdby/datafile/undotbs1.258.732796289 tag=TAG20101130T130241
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00005 name=+DATA/rac11g2/datafile/undotbs2.264.732796603
output file name=/data/oradata/stdby/datafile/undotbs2.264.732796603 tag=TAG20101130T130241
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=+DATA/rac11g2/datafile/users.259.732796291
output file name=/data/oradata/stdby/datafile/users.259.732796291 tag=TAG20101130T130241
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 30-NOV-10

sql statement: alter system archive log current

contents of Memory Script:
{
switch clone datafile all;
}
executing Memory Script

datafile 1 switched to datafile copy
input datafile copy RECID=33 STAMP=736434316 file name=/data/oradata/stdby/datafile/system.256.732796287
datafile 2 switched to datafile copy
input datafile copy RECID=34 STAMP=736434316 file name=/data/oradata/stdby/datafile/sysaux.257.732796289
datafile 3 switched to datafile copy
input datafile copy RECID=35 STAMP=736434316 file name=/data/oradata/stdby/datafile/undotbs1.258.732796289
datafile 4 switched to datafile copy
input datafile copy RECID=36 STAMP=736434316 file name=/data/oradata/stdby/datafile/users.259.732796291
datafile 5 switched to datafile copy
input datafile copy RECID=37 STAMP=736434316 file name=/data/oradata/stdby/datafile/undotbs2.264.732796603
Finished Duplicate Db at 30-NOV-10
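A minimal sketch of creating the spfile explicitly at this point, assuming the pfile resides in the standby ORACLE_HOME/dbs as initstdby.ora:
SQL> create spfile from pfile='/opt/app/oracle/product/11.2.0/ent/dbs/initstdby.ora';
SQL> shutdown immediate;
SQL> startup mount;
The instance then runs with the spfile, and redo apply can be started as in step 12 above.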

Not related to the title, but for a single instance to single instance physical standby through active database duplication (assuming the primary is ent11g2), on the primary run
alter system set log_archive_config='dg_config=(ent11g2,stdby)' scope=both;
alter system set log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=ent11g2' scope=both;
alter system set log_archive_dest_2='service=STDBYTNS LGWR ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=stdby' scope=both;
alter system set log_archive_dest_state_1='enable' scope=both;
alter system set log_archive_dest_state_2='enable' scope=both;
alter system set fal_server='STDBYTNS' scope=both;
alter system set fal_client='PRIMARYTNS' scope=both;
alter system set log_archive_max_processes=10 scope=both;
alter system set db_file_name_convert='/data/oradata/stdby/','/data/oradata/ENT11G2' scope=spfile;
alter system set log_file_name_convert='/data/flash_recovery/STDBY','/data/flash_recovery/ENT11G2' scope=spfile;
alter system set standby_file_management='AUTO' scope=both;
RMAN command for duplication for standby
duplicate target database for standby from active database
spfile
parameter_value_convert 'ent11g2','stdby','ENT11G2','stdby'
set db_unique_name='stdby'
set db_file_name_convert='/ENT11G2','/stdby'
set log_file_name_convert='/ENT11G2','/stdby'
set control_files='/data/oradata/stdby/controlfile/control01.ctl'
set log_archive_max_processes='5'
set fal_client='STDBYTNS'
set fal_server='PRIMARYTNS'
set standby_file_management='AUTO'
set log_archive_config='dg_config=(ent11g2,stdby)'
set log_archive_dest_2='service=PRIMARYTNS LGWR ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=ent11g2';