Thursday, June 26, 2014

Recreating Dataguard Broker Configuration After ORA-16816: Incorrect Database Role

Following a failed switchover using dataguard broker the database role information was in a inconsistent state.
DGMGRL> show configuration

Configuration - APDG

  Protection Mode: MaxPerformance
  Databases:
    XA04  - Primary database
      Error: ORA-16816: incorrect database role

    XA04S - Physical standby database
      Error: ORA-16810: multiple errors or warnings detected for the database

Fast-Start Failover: DISABLED

Configuration Status:
ERROR
Even though the dataguard broker still consider the XA04 as the primary database and XA04S as the standby database, the databases themselves have under gone the role change and new role is reflected on the v$database view. From the XA04S which dataguard broker still considers the standby
SQL> select database_role from v$database;

DATABASE_ROLE
----------------
PRIMARY
And from XA04 which dataguard broker considers the primary
SQL> select database_role from v$database;

DATABASE_ROLE
----------------
PHYSICAL STANDBY
It's clear that dataguard broker contains erroneous information.



Solution for this situation is to recreate the dataguard broker. This does not entail dropping the configuration. Dropping the dataguard broker would remove all data guard related configuration parameters from the spfile. To recreate the dataguard broker remove the dataguard broker configuration files using a OS utility (rm, delete) if the files are stored in the OS, or using ASMCMD rm if they are stored in the ASM (as in the case of RAC).
Once the configuration files are removed recreate the dataguard broker again by connecting to the new primary database instance.

Useful metalink notes
Step By Step How to Recreate Dataguard Broker Configuration [ID 808783.1]
Unable To Recreate Data Guard Fast Start Failover Configuration With DGMGRL [ID 454418.1]

Tuesday, June 24, 2014

ORA-30012: undo tablespace does not exist or of wrong type

The following error message was observed on the alert logs of all but one instance trying to start them after a switchover using dataguard broker.
Undo initialization errored: err:30012 serial:0 start:568825214 end:568825224 diff:10 (0 seconds)
Errors in file /opt/app/oracle/diag/rdbms/xa04/xa04s2/trace/xa04s2_ora_19364.trc:
ORA-30012: undo tablespace 'UNDOTBS8' does not exist or of wrong type
Errors in file /opt/app/oracle/diag/rdbms/xa04/xa04s2/trace/xa04s2_ora_19364.trc:
ORA-30012: undo tablespace 'UNDOTBS8' does not exist or of wrong type
Error 30012 happened during db open, shutting down database
USER (ospid: 19364): terminating the instance due to error 30012
Instance terminated by USER, pid = 19364
ORA-1092 signalled during: ALTER DATABASE OPEN /* db agent *//* {3:48375:37745} */...
opiodr aborting process unknown ospid (19364) as a result of ORA-1092
Tue Jun 24 00:41:56 2014
ORA-1092 : opitsk aborting process
Both primary and standby databases are RACs in a dataguard configuration and the problem was happening on the new primary (old standby).
Checking the tablespace and datafile views shows the presence of the above mentioned tablespace and the undo data file. Also each instance has its own tablespace assigned to it, as it should be on a RAC environment.
xa04s1.undo_tablespace='UNDOTBS7'
xa04s2.undo_tablespace='UNDOTBS8'
xa04s3.undo_tablespace='UNDOTBS9'
xa04s4.undo_tablespace='UNDOTBS10'
xa04s5.undo_tablespace='UNDOTBS11'
The undo management was already set to auto when the dataguard configuration was created. Yet it was not possible to start the instances (xa04s2-xa04s5).



There was no data guard specific information on this error but MOS note 1344944.1 shows how to create an undo tablespace when the same error occurs for newly added RAC instance. Following this note new undo tablespaces were created for each of the instances after which it was possible to bring up all the instances. But it is not clear as to why this error occurred when each instance was assigned a specific undo tablespace and that file is physically present in the system.

Useful metalink notes
How to create undo tablespace for a newly added RAC instance (ORA-30012) [ID 1344944.1]
ORA-30012 Database Does Not Start With UNDO_MANAGEMENT=AUTO [ID 258506.1]

Friday, June 6, 2014

Skipping Tablespace During Backup and Recovery

Oracle provides skip and exclude options to keep a tablespace(s) out of a full database backup. However when the restore happens this exclusion must be handled before the database could be used by the application, and this handling depends on what effect this exclusion has on the application data structures.
Scenario given in this post shows where a table is using two tablespaces for storage, while the table remain in one tablespace the lob column related segments of that table are stored in a different tablespace. This type of table configuration is used to store transient data such as application session information. The session id is stored in a 2k tablespace while the session data (java serialized objects) stored in a 32kB tablespace. However for the testing 8k block size tablespaces are used for both table and lob segments. Database version is 11.2.0.3.
Set up the test case by creating the two tablespaces. Here the excbackup tablespace will be excluded from the backup.
create tablespace EXCBACKUP DATAFILE size 10m;
create tablespace INCBACKUP DATAFILE size 10m;

alter user asanga quota unlimited on excbackup quota unlimited on incbackup;
Create the table and populate it with data. The disable storage in row is important as the inserted data is less than 4k without disabling storage in row the lob segment will also get stored in the inbackup tablespace.
create table y (a number, b blob) tablespace incbackup lob(b) store as lobseg (tablespace excbackup DISABLE STORAGE IN ROW);
 
 begin
    for i in 1 .. 100
    loop
    insert into y values (i,utl_raw.cast_to_raw('ahgahgaashaghag'));
    end loop;
    end;
    /

select segment_name,tablespace_name,bytes from user_segments;

SEGMENT_NAME                   TABLESPACE_NAME                     BYTES
------------------------------ ------------------------------ ----------
Y                              INCBACKUP                           65536
SYS_IL0000105332C00002$$       EXCBACKUP                          131072
LOBSEG                         EXCBACKUP                         8388608
It could be seen that lob segments are stored in the excbackup tablespace.
Configure RMAN such that excbackup tablespace is excluded when a full database backup is done.
RMAN> CONFIGURE EXCLUDE FOR TABLESPACE excbackup;

using target database control file instead of recovery catalog
Tablespace EXCBACKUP will be excluded from future whole database backups
new RMAN configuration parameters are successfully stored

RMAN> show exclude;

RMAN configuration parameters for database with db_unique_name FGACDB are:
CONFIGURE EXCLUDE FOR TABLESPACE 'EXCBACKUP';
Once the configuration is done run a full database backup including archive logs. The truncated output below shows that datafile belonging to excbackup is not part of the backup
RMAN> backup database plus archivelog delete all input;
...
...
archived log file name=/opt/app/oracle/fast_recovery_area/FGACDB/archivelog/2014_06_06/o1_mf_1_7_9s3s0jxs_.arc RECID=62 STAMP=849545633
Finished backup at 06-JUN-14

Starting backup at 06-JUN-14
using channel ORA_DISK_1
file 5 is excluded from whole database backup
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00012 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_cp_9rs1nq2y_.dbf
input datafile file number=00013 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_dumps_9rs1pg40_.dbf
input datafile file number=00014 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_cms_9rs1pgkf_.dbf
input datafile file number=00015 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_monitor_9rs1ph0s_.dbf
input datafile file number=00017 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_cachetbs_9rs1rjlm_.dbf
input datafile file number=00001 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_system_9rs1rll5_.dbf
input datafile file number=00002 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_sysaux_9rs1rm61_.dbf
input datafile file number=00003 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_undotbs1_9rs1smgo_.dbf
input datafile file number=00004 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_users_9rs1ss96_.dbf
input datafile file number=00006 name=/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_incbacku_9s3rrzr3_.dbf
channel ORA_DISK_1: starting piece 1 at 06-JUN-14
channel ORA_DISK_1: finished piece 1 at 06-JUN-14
piece handle=/opt/app/oracle/fast_recovery_area/FGACDB/backupset/2014_06_06/o1_mf_nnndf_TAG20140606T165402_9s3s0tyd_.bkp tag=TAG20140606T165402 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:35
Finished backup at 06-JUN-14
...
...
Next step is the restore and recovery.



To simulate a situation where all data files are lost, database was shutdown and all data files removed (using OS utility) before the restore. The restore steps are given below.
RMAN> startup nomount;
RMAN> restore controlfile from autobackup;
RMAN> alter database mount;
RMAN> restore database;
....
....
file 5 is excluded from whole database backup
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_system_9rs1rll5_.dbf
channel ORA_DISK_1: restoring datafile 00002 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_sysaux_9rs1rm61_.dbf
channel ORA_DISK_1: restoring datafile 00003 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_undotbs1_9rs1smgo_.dbf
channel ORA_DISK_1: restoring datafile 00004 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_users_9rs1ss96_.dbf
channel ORA_DISK_1: restoring datafile 00006 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_incbacku_9s3rrzr3_.dbf
channel ORA_DISK_1: restoring datafile 00012 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_cp_9rs1nq2y_.dbf
channel ORA_DISK_1: restoring datafile 00013 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_dumps_9rs1pg40_.dbf
channel ORA_DISK_1: restoring datafile 00014 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_cms_9rs1pgkf_.dbf
channel ORA_DISK_1: restoring datafile 00015 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_monitor_9rs1ph0s_.dbf
channel ORA_DISK_1: restoring datafile 00017 to /opt/app/oracle/oradata/FGACDB/datafile/o1_mf_cachetbs_9rs1rjlm_.dbf
channel ORA_DISK_1: reading from backup piece /opt/app/oracle/fast_recovery_area/FGACDB/backupset/2014_06_06/o1_mf_nnndf_TAG20140606T165402_9s3s0tyd_.bkp
channel ORA_DISK_1: piece handle=/opt/app/oracle/fast_recovery_area/FGACDB/backupset/2014_06_06/o1_mf_nnndf_TAG20140606T165402_9s3s0tyd_.bkp tag=TAG20140606T165402
....
....
Again the skipping of the file belonging to excbackup tablespace is mentioned during the restore. Since the backupset used here doesn't contain the excbackup it is possible to use the restore database command. However if the restore process is using a backupset that has a particular tablespace and it need to be excluded then use RESTORE DATABASE SKIP TABLESPACE command.
Unlike the restore command the recover database will not work
RMAN> recover database;

Starting recover at 06-JUN-14
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 06/06/2014 16:58:43
RMAN-06094: datafile 5 must be restored
Recovery will complain the excluded file must be restored. To overcome this use recover command with skip tablespace. The tablespace name must be given in upper case.
RMAN> recover database skip tablespace 'excbackup'; # failure due to tablespace name being specified in lower case

Starting recover at 06-JUN-14
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 06/06/2014 16:59:00
RMAN-06094: datafile 5 must be restored

RMAN> recover database skip tablespace 'EXCBACKUP';

Executing: alter database datafile 5 offline
starting media recovery

archived log for thread 1 with sequence 8 is already on disk as file /opt/app/oracle/fast_recovery_area/FGACDB/onlinelog/o1_mf_4_86mco7kb_.log
archived log for thread 1 with sequence 9 is already on disk as file /opt/app/oracle/fast_recovery_area/FGACDB/onlinelog/o1_mf_1_86mcp3kk_.log
archived log file name=/opt/app/oracle/fast_recovery_area/FGACDB/onlinelog/o1_mf_4_86mco7kb_.log thread=1 sequence=8
archived log file name=/opt/app/oracle/fast_recovery_area/FGACDB/onlinelog/o1_mf_1_86mcp3kk_.log thread=1 sequence=9
media recovery complete, elapsed time: 00:00:01
Finished recover at 06-JUN-14
Finally open the database with reset logs
RMAN> alter database open resetlogs;
At this stage the database is open and could be used as long as the segments that were in the excluded tablespace are not needed. All table data that were part of the incbackup tablespace could be access without any error.
SQL> select a from y;

         A
----------
         1
         2
         3
         4
         5
Any attempt to access the blob segment which was in the excluded tablespace results in an error
SQL> select b from y;
ERROR:
ORA-00376: file 5 cannot be read at this time
ORA-01110: data file 5:
'/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_excbacku_9s3rryro_.dbf'
It is not possible to delete or truncate the table either
SQL> delete from y;
delete from y
            *
ERROR at line 1:
ORA-00376: file 5 cannot be read at this time
ORA-01110: data file 5:
'/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_excbacku_9s3rryro_.dbf'


SQL> truncate table y;
truncate table y
               *
ERROR at line 1:
ORA-00376: file 5 cannot be read at this time
ORA-01110: data file 5:
'/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_excbacku_9s3rryro_.dbf'
During this time the table segment information is listed as below.
SQL> select segment_name,tablespace_name,bytes from user_segments;

SEGMENT_NAME               TABLESPACE_NAME                     BYTES
-------------------------- ------------------------------ ----------
Y                          INCBACKUP                           65536
SYS_IL0000105365C00002$$   EXCBACKUP                           65536
LOBSEG                     EXCBACKUP
Dropping the table doesn't remove the lob segments either (it could be that these don't occupy any space but entry is there under a different name)
SQL> drop table y purge;

SQL>  select segment_name,tablespace_name,bytes from user_segments;

SEGMENT_NAME   TABLESPACE_NAME                     BYTES
-------------- ------------------------------ ----------
5.130          EXCBACKUP
5.138          EXCBACKUP                           65536
It is also possible to drop the missing data file offline and add another data file to the tablespace and recreate the table.
SQL>  alter database   datafile '/opt/app/oracle/oradata/FGACDB/datafile/o1_mf_excbacku_9s3rryro_.dbf' offline drop;

SQL> alter tablespace excbackup add DATAFILE size 10m;

SQL> create table y (a number, b blob) tablespace incbackup lob(b) store as lobseg (tablespace excbackup DISABLE     STORAGE IN ROW);


 begin
    for i in 1 .. 100
    loop
    insert into y values (i,utl_raw.cast_to_raw('ahgahgaashaghag'));
    end loop;
    end;
    /
But still the segment information from the earlier table remains.
SQL> select segment_name,tablespace_name,bytes from user_segments;

SEGMENT_NAME               TABLESPACE_NAME                     BYTES
-------------------------- ------------------------------ ----------
Y                          INCBACKUP                           65536
SYS_IL0000105337C00002$$   EXCBACKUP                           65536
LOBSEG                     EXCBACKUP                          983040
5.130                      EXCBACKUP
5.138                      EXCBACKUP
Only way to get rid of these phantom segment information is to drop the tablespace and recreate it again.
Therefore to have a clean restore and recovery in this type of scenario best course of actions is, once the database is open, drop the table, drop the tablespace and then recreate the tablespace and the table as before.