Friday, September 14, 2018

ORA-28086: The data redaction policy expression has an error When Cursor Sharing is Force

Adding a policy using dbms_redact on 11.2.0.4 fails when cursor_sharing is set to FORCE. The latest PSU available for the DB version had already been applied to the DB, but this didn't resolve the issue. The currently applied patch is shown below.
SQL> select comments from dba_registry_history;

COMMENTS
----------------------------------------
PSU 11.2.0.4.180717
Cursor sharing mode
SQL> show parameter cursor_sharing

NAME            TYPE     VALUE
--------------- -------- ---------
cursor_sharing  string   FORCE
Adding a policy using dbms_redact fails with the following error
SQL>  BEGIN
  2    DBMS_REDACT.add_policy(
  3      object_schema => 'asanga',
  4      object_name   => 'cc_details',
  5      column_name   => 'cc_no',
  6      policy_name   => 'redact_card_info',
  7      function_type => DBMS_REDACT.full,
  8      expression  => 'SYS_CONTEXT(''TOP_SEC'',''APP_USER'') <> ''asanga'' or SYS_CONTEXT(''TOP_SEC'',''APP_USER'') IS NULL'
  9    );
 10  END;
 11  /
 BEGIN
*
ERROR at line 1:
ORA-28086: The data redaction policy expression has an error.
ORA-28087: The policy expression has an unsupported (use of) operator 'SYS_CONTEXT'.
ORA-06512: at "SYS.DBMS_REDACT_INT", line 3
ORA-06512: at "SYS.DBMS_REDACT", line 42
ORA-06512: at line 2
As a workaround, change the cursor sharing mode to "exact" and run the add policy call again.



If the cursor sharing mode cannot be changed, apply patch 20693579, which allows DBMS_REDACT.add_policy to work even when cursor sharing is set to force.
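A minimal sketch of the workaround, reusing the policy call from above. A session-level ALTER SESSION is shown here; whether that suffices depends on how the policy expression is validated, so an instance-level ALTER SYSTEM is the safer option.

```sql
-- Temporarily switch cursor sharing to EXACT for this session
ALTER SESSION SET cursor_sharing = EXACT;

BEGIN
  DBMS_REDACT.add_policy(
    object_schema => 'asanga',
    object_name   => 'cc_details',
    column_name   => 'cc_no',
    policy_name   => 'redact_card_info',
    function_type => DBMS_REDACT.full,
    expression    => 'SYS_CONTEXT(''TOP_SEC'',''APP_USER'') <> ''asanga'' or SYS_CONTEXT(''TOP_SEC'',''APP_USER'') IS NULL'
  );
END;
/

-- Restore the original setting afterwards
ALTER SESSION SET cursor_sharing = FORCE;
```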

Related Metalink Notes
Bug 20693579 DBMS_REDACT.ADD_POLICY fails with ORA-28086 when CURSOR_SHARING is force [ID 20693579.8]
Data Redaction Package DBMS_REDACT not valid after upgrading to 11.2.0.4 [ID 1945055.1]
Create Data Redaction Policy returns error ORA-06521: PL/SQL: Error mapping function, ORA-06512: at "SYS.DBMS_REDACT_INT", line 3 [ID 2222699.1]

Saturday, September 8, 2018

DG Broker Continue With Status "ROLLING DATABASE MAINTENANCE IN PROGRESS" Even After ROLLBACK_PLAN is Called

With Oracle 12c a new PL/SQL package called DBMS_ROLLING was introduced, which greatly automates the rolling upgrade of databases with minimum downtime and maximum protection. Refer to MOS 2086512.1 and the white paper from MAA. Use of DBMS_ROLLING requires the Active Data Guard option license. DBMS_ROLLING can only be used on versions 12.1.0.2 or above; for other versions use the transient logical standby method for upgrades.
The database version used for this post is 12.2.0.1. Prior releases required disabling the DG broker when dbms_rolling was used, but there are improvements in 12.2 according to the Oracle documentation. As of Oracle Database 12c Release 2 (12.2.0.1), Data Guard broker can remain on during a DBMS_ROLLING rolling upgrade; there is no longer any need to disable it.
Broker support is enabled by default during execution of the DBMS_ROLLING.BUILD_PLAN procedure if the broker is enabled at the time of the call. When broker support is enabled, the broker sets up the redo transport destinations as necessary from the original primary database as well as from the rolling upgrade target, manages instance count on the upgrade target if it is an Oracle RAC database, and notifies Oracle Clusterware and Global Data Services as appropriate during the course of the rolling upgrade.
Broker support can be manually controlled using the DBMS_ROLLING parameter DGBROKER. Although role transitions are typically performed using the broker, the switchover step in a rolling upgrade should continue to be performed using the DBMS_ROLLING.SWITCHOVER procedure.
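For example, broker support could be disabled before building the plan via DBMS_ROLLING.SET_PARAMETER. The value literal below follows the 12.2 DBMS_ROLLING documentation (0 = disabled, 1 = enabled) and should be verified against your release.

```sql
-- Disable broker support for the rolling upgrade; run after INIT_PLAN
-- and before BUILD_PLAN (value '0' assumed per 12.2 documentation)
EXEC DBMS_ROLLING.SET_PARAMETER(name => 'DGBROKER', value => '0');
```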

Furthermore, the broker admin doc states "When an upgrade is performed using the PL/SQL package, DBMS_ROLLING, the DG_BROKER_CONFIG_FILEn parameters must specify a location outside of Oracle home. (The default location is the dbs directory in Oracle home.)"
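As a sketch, the broker config files could be relocated to ASM before the upgrade. The paths below are hypothetical; note that the broker must be stopped while these parameters are changed.

```sql
-- Stop the broker before relocating its configuration files
ALTER SYSTEM SET dg_broker_start=FALSE;

-- Hypothetical ASM paths; any location outside the Oracle home will do
ALTER SYSTEM SET dg_broker_config_file1='+DATA/colombo/dr1colombo.dat' SCOPE=BOTH;
ALTER SYSTEM SET dg_broker_config_file2='+DATA/colombo/dr2colombo.dat' SCOPE=BOTH;

-- Restart the broker
ALTER SYSTEM SET dg_broker_start=TRUE;
```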
The current data guard configuration consists of two databases and DG broker status is shown below.
DGMGRL> show configuration

Configuration - dg12c2

  Protection Mode: MaxAvailability
  Members:
  colombo - Primary database
    london  - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 44 seconds ago)
The supplemental logging status is as below.
select SUPPLEMENTAL_LOG_DATA_ALL,SUPPLEMENTAL_LOG_DATA_MIN,
SUPPLEMENTAL_LOG_DATA_PK,SUPPLEMENTAL_LOG_DATA_UI,
SUPPLEMENTAL_LOG_DATA_FK,SUPPLEMENTAL_LOG_DATA_PL 
from v$database;

SUP SUPPLEME SUP SUP SUP SUP
--- -------- --- --- --- ---
NO  NO       NO  NO  NO  NO
In previous versions the supplemental logging settings had to be manually rolled back after a dbms_rolling upgrade. With 12.2 this is no longer necessary, as seen later in the post.
This post shows the steps up to starting the upgrade plan and then rolling back. On the critical path, after the start plan one would move on to switchover. Rollback cannot be called once the switchover has happened (refer to page 5 of the white paper mentioned above).
1. To begin the upgrade using dbms_rolling, put the standby in mount mode and run init_plan. As there are only two databases in the configuration, the only standby is made the future primary.
SQL> exec DBMS_ROLLING.INIT_PLAN(future_primary=>'london');
PL/SQL procedure successfully completed.
Check the plan parameter status with
select scope, name, curval from dba_rolling_parameters order by scope, name;
2. Next execute the build plan and check the plan values.
SQL> exec DBMS_ROLLING.BUILD_PLAN;
PL/SQL procedure successfully completed.

SELECT instid, target, phase, description FROM DBA_ROLLING_PLAN;
3. Run the start plan to begin the conversion of the physical standby to a transient logical standby.
SQL> exec DBMS_ROLLING.START_PLAN;
PL/SQL procedure successfully completed.
This step does the following:

i. Backing up the control file of each database to a trace file. This can be verified in the alert logs of the primary and the standby. On the primary alert log:
RTS(17286): executing rolling upgrade instruction 26 from plan revision 1
alter database backup controlfile to 'rolling_change_backup.f' reuse
Completed: alter database backup controlfile to 'rolling_change_backup.f' reuse
On the standby alert log:
RTS(2699): executing rolling upgrade instruction 28 from plan revision 1
alter database backup controlfile to 'rolling_change_backup.f' reuse
Completed: alter database backup controlfile to 'rolling_change_backup.f' reuse
ii. Creating flashback database guaranteed restore points on both the primary and the standby. Alert log entries similar to the following can be seen on both databases.
2018-08-16T06:32:40.016320-04:00
RTS(17286): executing rolling upgrade instruction 22 from plan revision 1
Created guaranteed restore point DBMSRU_INITIAL
iii. Building a LogMiner dictionary at the primary database. The following entries can be seen in the primary alert log:
Thu Aug 16 06:32:49 2018
Logminer Bld: Done
LOGMINER: PLSQL Quiesce mode enabled
LOGMINER: Dictionary Build: Waiting for completion of transactions started at or before scn 11927360 (0x0000000000b5ff40)
LOGMINER: Dictionary Build: All transactions started at or before scn 11927360 (0x0000000000b5ff40) have completed
2018-08-16T06:32:49.308962-04:00
LOGSTDBY: Dictionary Build PLSQL Quiesce - SID: 2268, wait_init: 300 sec, wait_timeout: 3600 sec.
iv. Recovering the designated physical standby into a transient logical standby database. The following can be seen in the standby alert log:
RECOVER TO LOGICAL STANDBY: Complete - Database mounted as logical standby
Completed: alter database recover to logical standby keep identity
At this stage the DG broker shows the standby as a transient logical standby database and gives a warning status as rolling database maintenance is active. Importantly (the reason for this post), the DG broker status says "ROLLING DATABASE MAINTENANCE IN PROGRESS".
DGMGRL> show configuration

Configuration - dg12c2

  Protection Mode: MaxAvailability
  Members:
  colombo - Primary database
    london  - Transient logical standby database

Fast-Start Failover: DISABLED

Configuration Status:
    ROLLING DATABASE MAINTENANCE IN PROGRESS


DGMGRL> show database london

Database - london

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          2 seconds (computed 0 seconds ago)
  Average Apply Rate: (unknown)
  Real Time Query:    OFF
  Instance(s):
    london

  Database Warning(s):
    ORA-16866: database converted to transient logical standby database for rolling database maintenance

Database Status:
WARNING
v. Loading the LogMiner dictionary into the logical standby database. The following entries can be seen in the standby alert log:
LOGMINER: Begin mining logfile during dictionary load for session 1 thread 1 sequence 267, +FRA/LONDON/foreign_archivelog/colombo/2018_08_16/thread_1_seq_267.365.984292375
LOGMINER: Preparing to load 1 dictionaries for session 1
LOGMINER: Finalizing dictionary load for session 1
LOGMINER: Gathering statistics on logminer dictionary. (incremental, nonparallel)
LOGMINER: Completed dictionary load for session 1
LOGMINER: End mining logfiles during dictionary load for session 1
vi. Configuring LGS databases with the transient logical standby database. As there were no other standbys in the configuration, there were no leading group standbys (LGS) to configure.



4. Ideally, from this point one would upgrade the standby to the new DB version, do a switchover, and upgrade the trailing group master. But if for some reason a rollback is needed, it has to happen before the switchover is done. This post is based on an issue encountered after the rollback. To roll back the steps done by dbms_rolling, put the standby DB in mount mode and execute rollback_plan.
SQL> select database_role, open_mode from v$database;

DATABASE_ROLE      OPEN_MODE
----------------   -----------
LOGICAL STANDBY    READ WRITE

SQL> startup force mount;

SQL> select open_mode from v$database;

OPEN_MODE
-----------
MOUNTED

SQL> exec DBMS_ROLLING.ROLLBACK_PLAN;
PL/SQL procedure successfully completed.
5. The rollback command converts the transient logical standby back to a physical standby and also rolls back the supplemental logging settings. If the rollback fails, the supplemental logging settings must be cleared manually.
SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------
PHYSICAL STANDBY MOUNTED

select SUPPLEMENTAL_LOG_DATA_ALL,SUPPLEMENTAL_LOG_DATA_MIN,
SUPPLEMENTAL_LOG_DATA_PK,SUPPLEMENTAL_LOG_DATA_UI,
SUPPLEMENTAL_LOG_DATA_FK,SUPPLEMENTAL_LOG_DATA_PL 
from v$database;

SUP SUPPLEME SUP SUP SUP SUP
--- -------- --- --- --- ---
NO  NO       NO  NO  NO  NO
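Had the rollback failed and left supplemental logging enabled, a manual cleanup along these lines would be needed on the primary. This is a sketch; which DROP statements apply depends on which logging levels were actually enabled.

```sql
-- Drop column-level supplemental logging first, then the minimal level
ALTER DATABASE DROP SUPPLEMENTAL LOG DATA (PRIMARY KEY, UNIQUE) COLUMNS;
ALTER DATABASE DROP SUPPLEMENTAL LOG DATA;
```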
6. End the upgrade plan with a call to destroy plan.
SQL> exec DBMS_ROLLING.DESTROY_PLAN;
PL/SQL procedure successfully completed.
7. Though the databases are back in the state they were in before the upgrade began, the Data Guard broker continues to say "ROLLING DATABASE MAINTENANCE IN PROGRESS".
DGMGRL> show configuration

Configuration - dg12c2

  Protection Mode: MaxAvailability
  Members:
  colombo - Primary database

  Members Not Receiving Redo:
  london  - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
    ROLLING DATABASE MAINTENANCE IN PROGRESS
Even though the standby is listed as not receiving redo, this is not actually the case. Apply-on status is shown in the DG broker output, and redo apply can be verified by looking at the standby alert log.
DGMGRL> show database london

Database - london

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 10.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    london

Database Status:
SUCCESS
Validate database also does not give any errors.
DGMGRL> validate database london

  Database Role:     Physical standby database
  Primary Database:  colombo

  Ready for Switchover:  Yes
  Ready for Failover:    Yes (Primary Running)

  Managed by Clusterware:
    colombo:  YES
    london :  YES
The reason the DG broker outputs an erroneous message is that the notification of DBMS_ROLLING finishing is included only in the FINISH phase.
SELECT instid, target, phase, description FROM DBA_ROLLING_PLAN;
...
83 colombo    FINISH         Notify Data Guard broker that DBMS_ROLLING has finished
84 london     FINISH         Notify Data Guard broker that DBMS_ROLLING has finished
The rollback and destroy phases have no provision to notify the DG broker that dbms_rolling has finished. Due to this, the DG broker continues to output an incorrect status. As per MOS 2347179.1 this has been identified as a bug (26921039) which is fixed in 18c, but no backport is available for 12c.



Even though the DG broker gives an incorrect status, this doesn't affect the ability to switch over.
Connected to "colombo"
Connected as SYSDG.
DGMGRL> switchover to london
Performing switchover NOW, please wait...
Operation requires a connection to database "london"
Connecting ...
Connected to "london"
Connected as SYSDBA.
New primary database "london" is opening...
Oracle Clusterware is restarting database "colombo" ...
Switchover succeeded, new primary is "london"
DGMGRL> show configuration

Configuration - dg12c2

  Protection Mode: MaxAvailability
  Members:
  london  - Primary database

  Members Not Receiving Redo:
  colombo - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
    ROLLING DATABASE MAINTENANCE IN PROGRESS


DGMGRL> validate database colombo

  Database Role:     Physical standby database
  Primary Database:  london

  Ready for Switchover:  Yes
  Ready for Failover:    Yes (Primary Running)

  Managed by Clusterware:
    london :  YES
    colombo:  YES

  Log Files Cleared:
    london Standby Redo Log Files:   Cleared
    colombo Online Redo Log Files:   Not Cleared
    colombo Standby Redo Log Files:  Available


DGMGRL> switchover to colombo
Performing switchover NOW, please wait...
Operation requires a connection to database "colombo"
Connecting ...
Connected to "colombo"
Connected as SYSDBA.
New primary database "colombo" is opening...
Oracle Clusterware is restarting database "london" ...
Switchover succeeded, new primary is "colombo"
DGMGRL> show configuration

Configuration - dg12c2

  Protection Mode: MaxAvailability
  Members:
  colombo - Primary database
    london  - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 2 seconds ago)
The only workaround mentioned in 2347179.1 to clear this incorrect DG broker status message is to restart the primary, which incurs unnecessary downtime. Testing showed that restarting the primary does clear the incorrect status message. As seen in the outputs above, a switchover does the same (as it results in the DB being restarted).
Alternatively, recreating the DG broker configuration also cleared the incorrect broker status. This seems the better option as it doesn't require any downtime.
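A sketch of recreating the broker configuration; the connect identifiers are assumed to match the database names and should be adjusted for your TNS setup.

```
DGMGRL> remove configuration;
DGMGRL> create configuration dg12c2 as primary database is colombo connect identifier is colombo;
DGMGRL> add database london as connect identifier is london maintained as physical;
DGMGRL> enable configuration;
```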

Saturday, September 1, 2018

ORA-01555: snapshot too old and Index Corruption

ORA-1555 is in most cases associated with an undersized undo tablespace (when Automatic Undo Management - AUM - is used). Lack of space in the undo tablespace results in unexpired undo information being overwritten, which raises ORA-1555 when queries try to get a read-consistent view of data. The typical solution is to size the undo tablespace such that it accommodates the longest running queries and the undo retention time (refer here for undo tablespaces with auto extend on).
But an undersized undo tablespace is not always the cause of ORA-1555, though the issue manifests with the same error. On the client side, ORA-1555 is raised as usual:
java.sql.SQLException: 
ORA-01555: snapshot too old: rollback segment number 8478 with name "_SYSSMU8478_1180928192$" too small
However, the key to identifying the root cause is the alert log entry associated with the error. In this case several queries failed, and the alert log had entries similar to the following:
ORA-01555 caused by SQL statement below (SQL ID: 194abmtxz7c7t, Query Duration=7 sec, SCN: 0x090d.05f1157b):

ORA-01555 caused by SQL statement below (SQL ID: ahrf09nz2mau1, Query Duration=4 sec, SCN: 0x090d.05f12397):
The query duration in this case was small. This is the key to identifying whether the ORA-1555 is due to an undersized undo tablespace or not. When the query duration is 0 or "small", the error is due to either a table/index mismatch or a primary key index corruption (refer to the MOS notes at the end of the post for more). Some MOS notes indicate the root cause of this corruption is still not identified, as the issue cannot be easily reproduced.
If the query duration is large, the most likely reasons are an undersized undo tablespace or LOB segment corruption, where the undo is stored inside the LOB (1950897.1, 1580790.1).



The solution is to run an analyze table command with validate structure (use the online option if running on a system that is in use) to identify corrupted indexes. Querying the user_indexes view for invalid indexes will not show the problematic indexes. The analyze command itself will raise ORA-1555 if it encounters a table with corrupted indexes.
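For example (the table name here is hypothetical):

```sql
-- CASCADE validates the table together with its indexes; ONLINE avoids
-- blocking DML while the validation runs
ANALYZE TABLE app_user.orders VALIDATE STRUCTURE CASCADE ONLINE;
```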
In the above situation ORA-1555 was encountered in several SQLs involving the same set of tables, so the analyze command was only run on the tables used in those queries. Three tables caused ORA-1555 to be raised during analysis. As the first step the indexes were rebuilt, and when that didn't resolve the issue the indexes were dropped and re-created. Recreating the indexes resolved the issue in this case. It is worth mentioning that the database in question (version 11.2.0.4 EE) had the parameter db_ultra_safe set to data_and_index. Yet this had not prevented the type of corruption which caused the ORA-1555 to be raised.
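A sketch of the two remediation attempts (index, table, and column names are hypothetical):

```sql
-- First attempt: rebuild in place (did not help in this case)
ALTER INDEX app_user.orders_pk REBUILD ONLINE;

-- What resolved it: drop and re-create the index from scratch.
-- If the index enforces a primary key constraint, the constraint must
-- be disabled first (otherwise the drop fails with ORA-02429).
DROP INDEX app_user.orders_pk;
CREATE UNIQUE INDEX app_user.orders_pk ON app_user.orders (order_id);
```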
If recreating the indexes doesn't resolve the issue, the next step is to create a new undo tablespace and switch the database to it. Drop the previously used undo tablespace once all its undo segments are offline. It must be said that starting the remediation with this step of creating a new undo tablespace and dropping the old one didn't resolve the issue. So running the analyze command is the key, and this could take a while to complete for large tables (there's no quick solution).
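A sketch of switching to a new undo tablespace (tablespace names, sizes, and the disk group are hypothetical):

```sql
-- Create the replacement undo tablespace and switch to it
CREATE UNDO TABLESPACE undotbs2 DATAFILE '+DATA' SIZE 10G AUTOEXTEND ON;
ALTER SYSTEM SET undo_tablespace = 'UNDOTBS2' SCOPE=BOTH;

-- Drop the old tablespace only after all its undo segments go offline:
-- SELECT segment_name, status FROM dba_rollback_segs
-- WHERE tablespace_name = 'UNDOTBS1';
DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES;
```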
If the issue is not resolved even after creating a new undo tablespace, raise an SR (refer 1682704.1).

Useful Metalink Notes
Master Note for ORA-1555 Errors [ID 1307334.1]
IF: ORA-1555 - Snapshot Too Old Error [ID 1950897.1]
How To Size UNDO Tablespace For Automatic Undo Management [ID 262066.1]
Troubleshooting ORA-01555 - Snapshot Too Old: Rollback Segment Number "String" With Name "String" Too Small [ID 1580790.1]
IF: ORA-1555 Reported with Query Duration = 0 , or a Few Seconds [ID 1950577.1]
ORA-01555 When Max Query Length Is Less Than Undo Retention, small or 0 Seconds [ID 1131474.1]
Primary Key Index Corruption Generates ORA-01555 With Small Query Duration or with Query Duration as 0 Seconds [ID 977902.1]
ORA-1555 Error With Query Duration as 0 Seconds [ID 761128.1]
SRDC - ORA-1555: Query Duration 0: Checklist of Evidence to Supply [ID 1682704.1]