Monday, February 18, 2013

Upgrading RHEL 6 OS in an 11gR2 RAC Environment

This post describes upgrading the RHEL 6 OS on servers that are running 11gR2 RAC. The current kernel is
$ uname -r
and the database patch level is
opatch lsinventory -local | grep Patch
Oracle Interim Patch Installer version
OPatch version    :
Patch  14275605     : applied on Thu Nov 15 14:21:27 GMT 2012
Unique Patch ID:  15379762
Patch description:  "Database Patch Set Update : (14275605)"
Sub-patch  13923374; "Database Patch Set Update : (13923374)"
Sub-patch  13696216; "Database Patch Set Update : (13696216)"
Sub-patch  13343438; "Database Patch Set Update : (13343438)"
Patch  14275572     : applied on Thu Nov 15 14:20:32 GMT 2012
Unique Patch ID:  15379762
Patch description:  "Grid Infrastructure Patch Set Update : (14275572)"
OPatch succeeded.
The database software is Standard Edition.
The steps are similar to those for upgrading the RHEL 5 OS running 11gR2 RAC, with a few minor differences. This RAC system does not use ASMLib.
1. Current kernel version
$ uname -r
2. Stop all the cluster components
crsctl stop cluster -all
then stop crs and disable its automatic start after reboot
crsctl stop crs
crsctl disable crs
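The three commands above can be wrapped in a small script that runs them in order and stops at the first failure. This is only a sketch: DRY_RUN, run_step and stop_cluster_steps are names made up for this example, and by default it only prints what it would run.

```shell
#!/bin/sh
# Sketch only: run the step 2 commands in order, stopping at the first failure.
# DRY_RUN, run_step and stop_cluster_steps are hypothetical names for this example.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        # Print the command instead of executing it.
        echo "would run: $*"
    else
        "$@" || { echo "failed: $*" >&2; return 1; }
    fi
}

stop_cluster_steps() {
    run_step crsctl stop cluster -all &&
    run_step crsctl stop crs &&
    run_step crsctl disable crs
}
```

Calling stop_cluster_steps with DRY_RUN=1 lists the commands. Setting DRY_RUN=0 executes them, so only do that as a user allowed to run crsctl.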
3. In RHEL 6 inittab is deprecated and the file that spawns the process is located in /etc/init with the name oracle-ohasd.conf
More on this is available in Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]. If oracle-tfa.conf is present, comment out the trace file analyzer script in it as well.
cat oracle-ohasd.conf
# Copyright (c) 2001, 2011, Oracle and/or its affiliates. All rights reserved.
# Oracle OHASD startup

start on runlevel [35]
stop  on runlevel [!35]
exec /etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
As root, comment out the last two lines
#exec /etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
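The edit can also be done with sed instead of a text editor. A minimal sketch follows, assuming only the exec line needs to be commented out as in the output above. comment_ohasd_exec is a hypothetical helper name, and the .bak backup suffix is a choice made for this example.

```shell
#!/bin/sh
# Sketch only: comment out the exec line of an upstart job file.
# comment_ohasd_exec is a hypothetical helper; the .bak suffix is an assumption.
comment_ohasd_exec() {
    conf="$1"
    # Keep a copy of the original file next to it.
    cp -p "$conf" "$conf.bak" || return 1
    # Prefix '#' to every line that starts with "exec "; other lines are untouched.
    sed 's/^exec /#exec /' "$conf.bak" > "$conf"
}
```

As root this would be run as comment_ohasd_exec /etc/init/oracle-ohasd.conf. Lines that are already commented do not match the pattern, so running it twice is harmless.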
4. Upgrade the OS (only a few screenshots shown)

5. New kernel version after upgrade
uname -r
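To confirm that the kernel really changed, the value from step 1 can be saved before the upgrade and compared afterwards. A small sketch, where kernel_changed and the file name used in the usage note are made up for this example:

```shell
#!/bin/sh
# Sketch only: succeed when two non-empty kernel version strings differ.
# kernel_changed is a hypothetical helper name.
kernel_changed() {
    before="$1"
    after="$2"
    [ -n "$before" ] && [ -n "$after" ] && [ "$before" != "$after" ]
}
```

Before the upgrade, save the version with uname -r > $HOME/kernel.before. After the reboot, kernel_changed "$(cat $HOME/kernel.before)" "$(uname -r)" && echo "kernel upgraded" reports whether the running kernel is different.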
6. Stop oracle-ohasd
initctl stop oracle-ohasd
oracle-ohasd stop/waiting
7. Un-comment the last two lines of /etc/init/oracle-ohasd.conf that were commented out earlier
start on runlevel [35]
stop  on runlevel [!35]
exec /etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
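A quick way to check that the file is back to its active form is to test for the three uncommented lines shown above. This is a sketch, and ohasd_conf_active is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: succeed when the job file has active start, stop and exec lines.
# ohasd_conf_active is a hypothetical helper name.
ohasd_conf_active() {
    conf="$1"
    grep -q '^start on runlevel' "$conf" &&
    grep -q '^stop  *on runlevel' "$conf" &&
    grep -q '^exec .*init\.ohasd run' "$conf"
}
```

For example, ohasd_conf_active /etc/init/oracle-ohasd.conf || echo "oracle-ohasd.conf still has commented lines" prints a warning if any of the three lines is missing or still commented.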
8. Enable crs on startup
crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.
9. Relink the Oracle Home binaries as the oracle software owner
which relink
$ cd $ORACLE_HOME/bin
$ ./relink all
writing relink log to: /opt/app/oracle/product/11.2.0/dbhome_1/install/relink.log
The relink log will end with the following
test ! -f /opt/app/oracle/product/11.2.0/dbhome_1/bin/oracle ||\
           mv -f /opt/app/oracle/product/11.2.0/dbhome_1/bin/oracle /opt/app/oracle/product/11.2.0/dbhome_1/bin/oracleO
mv /opt/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/oracle /opt/app/oracle/product/11.2.0/dbhome_1/bin/oracle
chmod 6751 /opt/app/oracle/product/11.2.0/dbhome_1/bin/oracle
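Since the log ends with the chmod line shown above, a simple check is to compare the last non-empty line of the log against it. This sketch only tests that the log reached its final line. It does not prove the relink was free of errors, and relink_log_complete is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: succeed when the last non-empty line of the relink log
# is the final chmod on the oracle binary, as shown in this post.
# relink_log_complete is a hypothetical helper name.
relink_log_complete() {
    log="$1"
    home="$2"
    # Drop blank lines, then take the last remaining line.
    last=$(grep -v '^[[:space:]]*$' "$log" | tail -n 1)
    [ "$last" = "chmod 6751 $home/bin/oracle" ]
}
```

Usage would be relink_log_complete $ORACLE_HOME/install/relink.log $ORACLE_HOME && echo "relink log looks complete".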

10. Unlock the permissions on the grid software as root
# cd $GI_HOME/crs/install

# perl -unlock
Using configuration parameter file: ./crsconfig_params
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
Successfully unlock /opt/app/11.2.0/grid
11. As the grid software owner (grid in this case), relink the grid software
[grid@rhel6m1 ~]$ export ORACLE_HOME=$GI_HOME
[grid@rhel6m1 ~]$ export PATH=$ORACLE_HOME/bin:$PATH
[grid@rhel6m1 ~]$ which relink
[grid@rhel6m1 ~]$ $GI_HOME/bin/relink
writing relink log to: /opt/app/11.2.0/grid/install/relink.log
The relink ends with
test ! -f /opt/app/11.2.0/grid/bin/oracle ||\
           mv -f /opt/app/11.2.0/grid/bin/oracle /opt/app/11.2.0/grid/bin/oracleO
mv /opt/app/11.2.0/grid/rdbms/lib/oracle /opt/app/11.2.0/grid/bin/oracle
chmod 6751 /opt/app/11.2.0/grid/bin/oracle
12. Run the post-relink script. -patch takes a while to complete and ends with an error as shown below
# cd $GI_HOME/rdbms/install
# ./

# cd $GI_HOME/crs/install
# perl -patch
Using configuration parameter file: ./crsconfig_params
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
Failed to write the checkpoint:'' with status:FAIL.Error code is 256
Oracle Grid Infrastructure stack start initiated but failed to complete at line 11545.
An SR confirmed that this error can be ignored and that crs can be started with "init 3" or by rebooting the server. The error occurs because the "ohasd run" line was commented out in the oracle-ohasd.conf file, so no "ohasd run" process is active after the kernel is upgraded and the server is rebooted. Without an active "ohasd run" process, the crsctl start crs issued during -patch fails to start the clusterware stack. This has no effect on the actual relink. This is also mentioned in 1050908.1.
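Whether an "ohasd run" process is active can be checked from the process list. The sketch below reads ps output on standard input and looks for the init.ohasd run command line taken from the exec line in oracle-ohasd.conf. ohasd_run_active is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: succeed when the process list on stdin contains "init.ohasd run".
# ohasd_run_active is a hypothetical helper name.
ohasd_run_active() {
    grep -q 'init\.ohasd run'
}
```

For example, ps -ef | ohasd_run_active || echo "no ohasd run process, crsctl start crs is expected to fail" gives a hint before running -patch.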
For metalink notes related to upgrading the OS and relinking the clusterware and Oracle binaries after an OS upgrade, refer to the related posts given below.

Related Posts
Upgrading OS in 11gR2 RAC Environment (RHEL 5)
Upgrading ASMLib and OS in 11gR1 RAC Environment

Useful metalink notes
Executing "relink all" resets permission of extjob, jssu, oradism, externaljob.ora [ID 1555453.1]
Relinking Oracle Home FAQ ( Frequently Asked Questions) [ID 1467060.1]