Wednesday, February 23, 2011

11gR2 RAC to RAC Switchover

Versions prior to 11gR2 (10gR1, 10gR2 and 11gR1) all required only one primary and one standby instance to be online during the switchover. But in 11gR2 it seem this behavior has been changed.

From 10gR1 Dataguard and concepts documentation "For a Real Application Clusters database, only one primary instance and one standby instance can be online during the switchover. Shut down all other instances before starting the switchover. Then, after the switchover completes, bring these instances back online."

From 10gR2 documentation "Verify that all but one RAC instance on the standby databases in a Real Application Clusters configuration are shut down. For a Real Application Clusters database, only one RAC instance on the standby database can be online during the role transition. Shut down all other instances before starting the role transition. Then, after the role transition completes, bring these instances back online. Even though only one RAC instance on the standby database is open during the switchover, all other standby database instances will automatically undergo a transition to their new role correctly when they are opened." This doesn't say anything about the primary RAC but it's same as 10gR1.

From 11gR1 documentation "Before performing a switchover from an Oracle RAC primary database to a physical standby database, shut down all but one primary database instance. Any primary database instances shut down at this time can be started after the switchover completes.
Before performing a switchover or a failover to an Oracle RAC physical standby database, shut down all but one standby database instance. Any standby database instances shut down at this time can be restarted after the role transition completes.
"

Now the 11gR2 documentation "Before performing a switchover from an Oracle RAC primary database to a physical standby database, shut down all but one primary database instance. Any primary database instances shut down at this time can be started after the switchover completes.

Before performing a switchover to a physical standby database that is in real-time query mode, consider bringing all instances of that standby database to the mounted but not open state to achieve the fastest possible role transition and to cleanly terminate any user sessions connected to the physical standby database prior to the role transition.
" It seems in 11gR2 if active dataguard is on (another word for real time query mode) then it must be stopped before the switchover but it is possible to keep ALL instances in the mount mode during the switchover.

Dataguard used here is the one created earlier.

First the switchover was carried out as per 11gR1 documentation with only one primary and one standby active during switchover. This worked fine as expected.

Next the switchover was carried out with all the standby instance in the mount state. They weren't in real time query mode so this setup is similar to a traditional physical standby setup where standby databases are in the mount mode. Steps below are from this second scenario.

Here rac11g2 is the primary db with two instances and rac11g2s is the standby database also with two instances.

1. On the primary stop all but one instance (as stated by 11gR2 documentation)
srvctl stop instance -d rac11g2 -i rac11g21
2. SQL Plus in to online instance and query the switchover status
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVER_STATUS
-----------------
TO STANDBY
3. If the switchover status is to_standby then
SQL> alter database commit to switchover to physical standby;
If the switchover status is session_active then
SQL> alter database commit to switchover to physical standby with session shutdown;
Once the command finishes following could be seen on the instance's alert log
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 52428800;
Archivelog for thread 2 sequence 95 required for standby recovery
Switchover: Primary controlfile converted to standby controlfile succesfully.
Switchover: Complete - Database shutdown required
Completed: alter database commit to switchover to physical standby
4. Shutdown the database and change cluster properties of the database to physical standby and default startup mode to mount and start the "new" standby databse (old primary)
SQL> shutdown abort

srvctl modify database -d rac11g2 -r physical_standby -s mount
srvctl start database -d rac11g2
Redo apply process could be started now or wait until the standby switchover to primary role

5. On the "old" standby (one that's going to be the new primary) check the switchover status
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVER_STATUS
-----------------
TO PRIMARY
Since both instances were left in mount mode this could be run in any instance and switchover status would be the same.

6. If the switchover status is to_primary then then run the following command on only one instnace
SQL> alter database commit to switchover to primary;
If the switchover status is session_active then (again only on one instance)
SQL> alter database commit to switchover to primary with session shutdown;
On the alert log of the instnace the command was run
Wed Feb 23 15:00:08 2011
alter database commit to switchover to primary
ALTER DATABASE SWITCHOVER TO PRIMARY (rac11g2s1)
Maximum wait for role transition is 15 minutes.
Switchover: Media recovery is still active
Role Change: Canceling MRP - no more redo to apply
Wed Feb 23 15:00:09 2011
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /opt/app/oracle/diag/rdbms/rac11g2s/rac11g2s1/trace/rac11g2s1_pr00_9833.trc:
ORA-16037: user requested cancel of managed recovery operation
Wed Feb 23 15:00:09 2011
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Wed Feb 23 15:00:09 2011
MRP0: Background Media Recovery process shutdown (rac11g2s1)
Role Change: Canceled MRP
Backup controlfile written to trace file /opt/app/oracle/diag/rdbms/rac11g2s/rac11g2s1/trace/rac11g2s1_ora_14026.trc
SwitchOver after complete recovery through change 3671973
Online log +DATA/rac11g2s/onlinelog/group_1.265.743447741: Thread 1 Group 1 was previously cleared
Online log +FLASH/rac11g2s/onlinelog/group_1.258.743447741: Thread 1 Group 1 was previously cleared
Online log +DATA/rac11g2s/onlinelog/group_2.266.743447743: Thread 1 Group 2 was previously cleared
Online log +FLASH/rac11g2s/onlinelog/group_2.259.743447743: Thread 1 Group 2 was previously cleared
Online log +DATA/rac11g2s/onlinelog/group_3.267.743447745: Thread 2 Group 3 was previously cleared
Online log +FLASH/rac11g2s/onlinelog/group_3.260.743447745: Thread 2 Group 3 was previously cleared
Online log +DATA/rac11g2s/onlinelog/group_4.268.743447747: Thread 2 Group 4 was previously cleared
Online log +FLASH/rac11g2s/onlinelog/group_4.261.743447747: Thread 2 Group 4 was previously cleared
Standby became primary SCN: 3671971
Switchover: Complete - Database mounted as primary
Completed: alter database commit to switchover to primary
Alert log of the other instance
Wed Feb 23 15:00:09 2011
Managed Standby Recovery not using Real Time Apply
Switchover: Complete - Database mounted as primary
7. Once the above command completes open all the instances with
SQL> ALTER DATABASE OPEN;
Database altered.
8. Change the cluster information of the database from physical standby to primary with
srvctl modify database -d rac11g2s -r  primary -s open
9. Start the log apply process on the standby with
SQL> alter database recover managed standby database using current logfile disconnect;
Database altered.
10. Verify the redo transport either querying the v$archive_log view.