Friday, October 24, 2014

APPEND_VALUES Hint and JDBC

The APPEND_VALUES hint was introduced in 11gR2 for direct path inserts with a values clause (the older APPEND hint is only useful for direct path loading with an insert ... select sub-query). The Oracle documentation mentions that the APPEND_VALUES hint is useful for enhancing performance and lists OCI programs and PL/SQL as examples. There's no mention of JDBC. Is there a performance gain from the APPEND_VALUES hint when the inserts are issued through JDBC?
This post presents a simple test case that tries to answer this question. The test case involves inserting 50,000 rows into a single column table. The insert statement is issued with and without (conventional insert) the hint. The table is also created with and without logging enabled, since the APPEND* hints generate less redo when the table has nologging enabled (although this depends not only on the table but on other factors as well). The test measures the redo size and CPU used by the session statistics, plus the total elapsed time to insert the rows (measured on the side of the Java test code). The test is repeated for the above combinations employing JDBC batching as well. The Java code used for the test case is given at the end of the post. The tests were carried out on an 11.2.0.3 EE database. (Update 2014/10/24: the same test was done on a 12.1.0.2 EE DB. The results were more or less the same. In 12.1.0.2 even nologging + batching with the append hint didn't outperform the same test without the hint.)
The first comparison is the size of the redo generated during the inserts.
The first graph shows the redo generated without batching the inserts. There's no surprise that when the table is in nologging mode the amount of redo generated is less than when the table is in logging mode. But not having the append hint (in this post "append hint" means the APPEND_VALUES hint) seems to generate less redo than having it, in both logging and nologging mode. On the other hand, when the inserts are batched and the table is in nologging mode, having the append hint results in the minimum redo being generated, and this amount is less than the redo generated without the append hint (in both logging and nologging modes). If the table is in logging mode, then batching without the append hint results in less redo than using the append hint.
The next statistic is the CPU used to complete the inserts. The CPU used by this session statistic is used to calculate this, by capturing its value before and after the inserts.
Batching the inserts results in the minimum CPU consumption compared to not batching, and when batched there's no great difference in CPU consumption with or without the append hint. However, when the inserts are not batched, the CPU consumed doubles or triples when the append hint is used compared to without the hint. So in terms of CPU consumption, having the append hint and not batching the inserts results in performance degradation.
The final statistic is the total elapsed time to insert the rows. This is roughly the total execution time of the test code. The time is measured in milliseconds.
Similar to CPU usage, batching results in the lowest elapsed time. This is no surprise as CPU is a component of the overall elapsed time. However, when the inserts are not batched, having the append hint results in a higher elapsed time compared to inserting without it.
From this limited test case it seems the only time it's beneficial to use the append hint with JDBC is when the inserts are batched and the table is in nologging mode. In all other cases, non-batched and batched inserts without the hint outperform inserts with the append hint. But having a nologging table may not be an option in a production system, and even where it is possible, highly concurrent inserts into a nologging table could result in a high number of control file related wait events.
Furthermore, there are a few other points to consider before deciding to use the append hint in an application. When APPEND_VALUES is used to insert into a table, another session cannot insert into it until the first session commits. The second session will hang, waiting on an enq: TM - contention wait event, which is usually associated with unindexed foreign key issues. So the concurrency of the inserts must be considered. If the inserts are highly concurrent then having the append hint may not be a good idea.
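This blocking behaviour can be demonstrated with two sessions against the test table. A sketch (session labels are illustrative):

-- session 1: direct path insert, left uncommitted
insert /*+ APPEND_VALUES */ into append_hint_test values ('abc');

-- session 2: this insert hangs on enq: TM - contention
-- until session 1 commits or rolls back
insert into append_hint_test values ('def');

-- session 1: committing releases the TM lock and session 2 proceeds
commit;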
Within the same session, after one insert another cannot be made without first committing the previous insert:
SQL> set auto off
SQL>  insert /*+ append_values */ into append_hint_test values ('abc');

1 row created.

SQL> insert /*+ append_values */ into append_hint_test values ('def');
insert /*+ append_values */ into append_hint_test values ('def')
                                 *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel
Therefore Java code that's written to reuse cursors with auto commit set to false will encounter the following error:
Exception in thread "main" java.sql.SQLException: ORA-12838: cannot read/modify an object after modifying it in parallel
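If the cursor must be reused with auto commit off, each APPEND_VALUES insert has to be followed by a commit (which negates much of the point of batching). Continuing the SQL*Plus session above illustrates the idea:

SQL> commit;

Commit complete.

SQL> insert /*+ append_values */ into append_hint_test values ('def');

1 row created.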
Also, the append hint results in direct loading of data above the high water mark of the table. This results in continuous growth of the table even if there's free space available (which may or may not be a problem in some cases). Therefore it may be better to use traditional batching with JDBC than the append hint, as the negative consequences of using it seem to outweigh the gains.



On the other hand, in PL/SQL with batch inserts (using FORALL) the append hint seems to outperform conventional inserts. The PL/SQL code used is also given at the end of the post. The graph below shows the redo size generated for inserting 50,000 rows with and without the append hint.


Create table for the test cases.
create table append_hint_test(value varchar2(50));
create table append_hint_plsql_test(value varchar2(50));
Java Code Used.
import java.sql.Connection;
import java.sql.PreparedStatement;

import oracle.jdbc.pool.OracleDataSource;

public class AppendHintTest {

    public static void main(String[] args) throws Exception {

        OracleDataSource dataSource = new OracleDataSource();
        dataSource.setURL("jdbc:oracle:thin:@192.168.0.66:1521:ent11g2");
        dataSource.setUser("asanga");
        dataSource.setPassword("asa");

        Connection con = dataSource.getConnection();
//               con.setAutoCommit(false);  
        DBStats stats = new DBStats();
        stats.initStats(con);

        String SQL = "insert /*+ APPEND_VALUES */ into append_hint_test values (?)";
//               String SQL = "insert into append_hint_test values (?)"; 

        PreparedStatement pr = con.prepareStatement(SQL);


        long t1 = System.currentTimeMillis();

        for (int i = 0; i < 50000; i++) {
            pr.setString(1, "akhgaipghapga " + i);
            pr.execute();

//                   pr.addBatch();
//                   if(i%1000==0){
//                       pr.executeBatch();

//                   }
        }

//               pr.executeBatch();
        con.commit();
        pr.close();

        long t2 = System.currentTimeMillis();
        String[][] statsValues = stats.getStatsDiff(con);
        con.close();
        System.out.println("time taken " + (t2 - t1));

        for (String[] x : statsValues) {

            System.out.println(x[0] + " : " + x[1]);
        }

    }
}

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;

public class DBStats {

    private HashMap<String, Long> stats = new HashMap<>();
    private String SQL = "select name,value " + "from v$mystat,v$statname " + "where v$mystat.statistic#=v$statname.statistic# "
            + "and v$statname.name in ('CPU used when call started','CPU used by this session','db block gets',"
            + "'db block gets from cache','db block gets from cache (fastpath)','db block gets direct',"
            + "'consistent gets','consistent gets from cache','consistent gets from cache (fastpath)',"
            + "'consistent gets - examination','consistent gets direct','physical reads',"
            + "'physical reads direct','physical read IO requests','physical read bytes',"
            + "'consistent changes','physical writes','physical writes direct',"
            + "'physical write IO requests','physical writes from cache','redo size')";

    public void initStats(Connection con) {
        try {
            PreparedStatement pr = con.prepareStatement(SQL);

            ResultSet rs = pr.executeQuery();


            while (rs.next()) {

                stats.put(rs.getString(1), rs.getLong(2));
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }

    }

    public String[][] getStatsDiff(Connection con) {
        
        String[][] statDif = new String[stats.size()][2];

        try {
            PreparedStatement pr = con.prepareStatement(SQL);

            ResultSet rs = pr.executeQuery();

            int i = 0;
            while (rs.next()) {
               Long val = rs.getLong(2) - stats.get(rs.getString(1));
               statDif[i][0] = rs.getString(1);
               statDif[i][1] = val.toString();
               i++;
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }


        return statDif;


    }
}
PL/SQL code used. Before running the PL/SQL test, populate the append_hint_test table with rows using the Java code above.
SET serveroutput ON
DECLARE
Type Arry_Type
IS
  TABLE OF Append_Hint_Test%Rowtype INDEX BY PLS_INTEGER;
  Loadtable Arry_Type;
  redosize1 NUMBER;
  redosize2 NUMBER;
  t1        NUMBER;
  t2        NUMBER;
Begin
  EXECUTE immediate 'truncate table append_hint_plsql_test';
  Select * Bulk Collect Into Loadtable From Append_Hint_Test ;
  
  Dbms_Output.Put_Line(Loadtable.Count);
  
  SELECT Value
  INTO Redosize1
  FROM V$mystat,
    V$statname
  Where V$mystat.Statistic#=V$statname.Statistic#
  AND V$statname.Name      = 'redo size';
  
  Dbms_Output.Put_Line('redo size 1 '||Redosize1);
  T1          := Dbms_Utility.Get_Time;
  
  Forall Indx IN 1 .. Loadtable.Count
--  INSERT /*+ APPEND_VALUES */
--  INTO append_hint_plsql_test VALUES
--    (Loadtable(Indx).Value
--    );
  Insert Into Append_Hint_Plsql_Test Values (Loadtable(Indx).Value);

  Commit;
  
  T2 := Dbms_Utility.Get_Time;
  
  SELECT Value
  INTO Redosize2
  FROM V$mystat,
    V$statname
  WHERE V$mystat.Statistic#=V$statname.Statistic#
  AND V$statname.Name      = 'redo size';
  Dbms_Output.Put_Line('redo size 2 '||Redosize2);
  Dbms_Output.Put_Line('redo generated : '||(Redosize2-Redosize1)|| ' Time taken : '||(t2-t1));
END;
/

Friday, September 12, 2014

Adding Far Sync Instances to Existing Data Guard Configuration

This post lists the steps for adding far sync instances to an existing data guard configuration (for both primary and standby databases, so they can be used when the standby becomes primary). Far sync is a new feature introduced with 12c which allows transporting redo synchronously from the primary DB to a "nearby" far sync instance, which then transports the redo asynchronously "over a longer distance". The idea is that there's low overhead on the primary when transporting redo synchronously to a nearby far sync instance compared to transporting over a long distance, achieving zero data loss as well as off-loading the redo transport.
The Oracle documentation provides a complete description of the far sync concept. Given below are a few important excerpts:
Many configurations have a primary database shipping redo to a standby database using asynchronous transport at the risk of some data loss at failover time. Using synchronous redo transport to achieve zero data loss may not be a viable option because of the impact on the commit response times at the primary due to network latency between the two databases. Creating a far sync instance close to the primary has the benefit of minimizing impact on commit response times to an acceptable threshold (due to the smaller network latency between primary and far sync instance) while allowing for higher data protection guarantees -- if the primary were to fail, and assuming the far sync instance was synchronized at the time of the failure, the far sync instance and the terminal standby would coordinate a final redo shipment from the far sync instance to the standby to ship any redo not yet available to the Standby and then perform a zero-data-loss failover.
A far sync instance manages a control file, receives redo into standby redo logs (SRLs), and archives those SRLs to local archived redo logs. A far sync instance does not have user data files, cannot be opened for access, cannot run redo apply, and can never function in the primary role or be converted to any type of standby database.
Far sync instances are part of the Oracle Active Data Guard Far Sync feature, which requires an Oracle Active Data Guard license.
In a configuration that contains a far sync instance, there must still be a direct network connection between the primary database and the remote standby database. The direct connection between the primary and the remote standby is used to perform health checks and switchover processing tasks. It is not used for redo transport unless the standby has been configured as an alternate destination in case the far sync instance fails and there is no alternate far sync configured to maintain the protection level.
The existing data guard configuration's primary database parameter settings and the active data guard creation script of the standby database are given below (other prerequisites for setting up data guard are omitted).
Primary database parameter changes
alter system set log_archive_config='dg_config=(ent12c1,ent12c1s)' scope=both ;
alter system set log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=ent12c1' scope=both;
alter system set log_archive_dest_2='service=ENT12C1STNS ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=ent12c1s' scope=both;
alter system set log_archive_dest_state_2='defer' scope=both;
alter system set log_archive_dest_state_1='enable' scope=both;
alter system set fal_server='ENT12C1STNS' scope=both;
alter system set log_archive_max_processes=10 scope=both;
alter system set db_file_name_convert='/opt/app/oracle/oradata/ENT12C1S','/data/oradata/ENT12C1' scope=spfile;
alter system set log_file_name_convert='/opt/app/oracle/oradata/ENT12C1S','/data/oradata/ENT12C1' ,'/opt/app/oracle/fast_recovery_area/ENT12C1S','/data/flash_recovery/ENT12C1' scope=spfile;
alter system set standby_file_management='AUTO' scope=both;
Standby creation script
mkdir -p  /opt/app/oracle/oradata/ENT12C1S/controlfile
mkdir -p /opt/app/oracle/fast_recovery_area/ENT12C1S/controlfile

duplicate target database for standby from active database spfile
parameter_value_convert 'ent12c1','ent12c1s','ENT12C1','ENT12C1S','data','opt/app/oracle','flash_recovery','fast_recovery_area'
set db_unique_name='ent12c1s'
set db_create_file_dest='/opt/app/oracle/oradata'
set db_recovery_file_dest='/opt/app/oracle/fast_recovery_area'
set db_file_name_convert='/data/oradata/ENT12C1','/opt/app/oracle/oradata/ENT12C1S'
set log_file_name_convert='/data/oradata/ENT12C1','/opt/app/oracle/oradata/ENT12C1S','/data/flash_recovery/ENT12C1','/opt/app/oracle/fast_recovery_area/ENT12C1S'
set log_archive_max_processes='10'
set fal_server='ENT12C1TNS'
set log_archive_dest_2='service=ENT12C1TNS ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=ent12c1'
set log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=ent12c1s';
This single instance data guard configuration, which transports redo asynchronously, is transformed as shown below.
In situation 1, ENT12C1 is the primary database, ENT12C1S is the standby, and FS12C1 is the far sync instance to which the primary database ships redo synchronously.
After a role switch, when ENT12C1S becomes the new primary, it uses FS12C1S as the far sync instance for synchronous redo transport.
In both situations there exists a direct redo transport path between primary and standby, which will transport redo asynchronously in case of far sync instance failure. Once the far sync instance is back up again, the data guard configuration will revert to using it for redo transport. If standby redo logs were created on the primary, then standby redo logs will be created automatically for the far sync instances when they are in use for redo transport.
1. On the servers used for creating the far sync instances, install the Oracle database software and create a listener. There's no requirement for a static listener configuration as the far sync instance automatically registers with the listener.
2. Create TNS entries for the far sync instances on the existing databases (primary and standby) and copy the existing TNS entries into the far sync instances' tnsnames.ora file.
cat tnsnames.ora

ENT12C1TNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ent12c1-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ent12c1)
    )
  )

ENT12C1STNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ent12c1s-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ent12c1)
    )
  )

FS12C1TNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fs12c1-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = fs12c1)
    )
  )

FS12C1STNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fs12c1s-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = fs12c1s)
    )
  )
3. Create a control file for the far sync instances by connecting to the primary database. The same control file is used for both far sync instances in this case (fs12c1 and fs12c1s).
SQL> ALTER DATABASE CREATE FAR SYNC INSTANCE CONTROLFILE AS '/home/oracle/controlfs.ctl';
4. Copy the control file to the far sync instances. In this case the control file is multiplexed and renamed on each far sync instance (as shown in the pfiles in subsequent steps).
scp controlfs.ctl fs12c1-host:/opt/app/oracle/oradata/FS12C1/controlfile/controlfs01.ctl
scp controlfs.ctl fs12c1s-host:/opt/app/oracle/oradata/FS12C1S/controlfile/control01.ctl
Similarly, copy the password file from the primary to $ORACLE_HOME/dbs on the servers where the far sync instances will be created. For far sync instances fs12c1 and fs12c1s the password file needs to be renamed orapwfs12c1 and orapwfs12c1s respectively.

5. Create a pfile from the primary spfile. This will be modified to reflect the far sync instance settings.
SQL> create pfile='/home/oracle/pfilefs.ora' from spfile;
6. Copy the pfile to the far sync instances ($ORACLE_HOME/dbs) and rename them to reflect the instance names (e.g. initfs12c1.ora and initfs12c1s.ora). Modify the init file used for the primary's far sync instance (fs12c1) as shown below. Not all parameters are needed for far sync and those could be removed from the pfile.
cat initfs12c1.ora

*.audit_file_dest='/opt/app/oracle/admin/fs12c1/adump'
*.audit_trail='OS'
*.compatible='12.1.0.2'
*.control_files='/opt/app/oracle/oradata/FS12C1/controlfile/controlfs01.ctl','/opt/app/oracle/oradata/FS12C1/controlfile/controlfs02.ctl'
*.db_block_size=8192
*.db_create_file_dest='/opt/app/oracle/oradata'
*.db_name='ent12c1'
*.db_recovery_file_dest_size=21474836480
*.db_recovery_file_dest='/opt/app/oracle/fast_recovery_area'
*.db_unique_name='fs12c1'
*.diagnostic_dest='/opt/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=fs12c1XDB)'
*.fal_server='ENT12C1TNS'
*.log_archive_config='dg_config=(ent12c1,ent12c1s,fs12c1,fs12c1s)'
*.log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=fs12c1'
*.log_archive_dest_2='service=ENT12C1STNS ASYNC NOAFFIRM valid_for=(STANDBY_LOGFILES,standby_role) db_unique_name=ent12c1s max_failure=10 max_connections=5 reopen=180'
*.log_archive_dest_state_1='enable'
*.log_archive_dest_state_2='enable'
*.log_archive_format='%t_%s_%r.dbf'
*.log_archive_max_processes=10
*.log_file_name_convert='/data/oradata/ENT12C1','/opt/app/oracle/oradata/FS12C1','/data/flash_recovery/ENT12C1','/opt/app/oracle/fast_recovery_area/FS12C1'
*.remote_login_passwordfile='EXCLUSIVE'
*.standby_file_management='AUTO'
7. Mount the far sync instance using the pfile and then create an spfile from the pfile. Without an spfile a warning is shown when the far sync instance is added to a data guard broker configuration. Besides that, having an spfile also allows subsequent parameter changes without restarting the far sync instance. Restart (mount) the far sync instance using the spfile.
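Since the pfile sits in $ORACLE_HOME/dbs under the default name, a plain startup finds it, and the second startup picks up the newly created spfile. A sketch of this step:

SQL> startup mount
SQL> create spfile from pfile;
SQL> shutdown immediate
SQL> startup mount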



8. Similarly, create the pfile for the far sync instance used by the current standby (when it becomes the primary), FS12C1S.
cat initfs12c1s.ora

*.audit_file_dest='/opt/app/oracle/admin/fs12c1s/adump'
*.audit_trail='OS'
*.compatible='12.1.0.2'
*.control_files='/opt/app/oracle/oradata/FS12C1S/controlfile/control01.ctl','/opt/app/oracle/fast_recovery_area/FS12C1S/controlfile/control02.ctl'
*.db_create_file_dest='/opt/app/oracle/oradata'
*.db_name='ent12c1'
*.db_recovery_file_dest_size=21474836480
*.db_recovery_file_dest='/opt/app/oracle/fast_recovery_area'
*.db_unique_name='fs12c1s'
*.diagnostic_dest='/opt/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=fs12c1sXDB)'
*.fal_server='ENT12C1STNS'
*.log_archive_config='dg_config=(ent12c1,ent12c1s,fs12c1,fs12c1s)'
*.log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=fs12c1s'
*.log_archive_dest_2='service=ENT12C1TNS ASYNC NOAFFIRM valid_for=(standby_logfiles,standby_role) db_unique_name=ent12c1 max_failure=10 max_connections=5 reopen=180'
*.log_archive_dest_state_1='enable'
*.log_archive_dest_state_2='enable'
*.log_archive_format='%t_%s_%r.dbf'
*.log_archive_max_processes=10
*.log_file_name_convert='/data/oradata/ENT12C1','/opt/app/oracle/oradata/FS12C1S','/data/flash_recovery/ENT12C1','/opt/app/oracle/fast_recovery_area/FS12C1S'
*.remote_login_passwordfile='EXCLUSIVE'
*.standby_file_management='AUTO'
9. Similar to step 7, create an spfile and restart (mount) the far sync instance using it.

10. Update the log archive config parameter on both primary and standby to include the far sync instance information.
alter system set log_archive_config='dg_config=(ent12c1,ent12c1s,fs12c1,fs12c1s)' scope=both ;
11. Update the fal server parameter on the primary (ent12c1) as below, which allows ent12c1 to fetch archive logs (when it becomes a standby) either from the new primary (ent12c1s) or from the far sync instance (fs12c1s).
alter system set fal_server='ENT12C1STNS','FS12C1STNS' scope=both;
12. Update the fal server parameter on the standby (ent12c1s) so that it can fetch archive logs either from ent12c1 (the primary) or from the far sync instance (fs12c1).
alter system set fal_server='ENT12C1TNS','FS12C1TNS' scope=both;
13. Update the log archive destination and log archive destination state on the primary such that redo transport is synchronous between primary and far sync and asynchronous between primary and standby (direct). Furthermore, the asynchronous log archive destination is set with state alternate, so that when the synchronous destination fails the data guard configuration starts shipping redo via this alternate destination.
alter system set log_archive_dest_state_2='enable' scope=both;
alter system set log_archive_dest_2='service=FS12C1TNS SYNC AFFIRM  db_unique_name=fs12c1  max_failure=1 valid_for=(online_logfiles,primary_role) alternate=log_archive_dest_3 max_connections=5' scope=both;

alter system set log_archive_dest_state_3='alternate' scope=both;
alter system set log_archive_dest_3='service=ENT12C1STNS ASYNC NOAFFIRM  db_unique_name=ent12c1s  valid_for=(online_logfiles,primary_role) alternate=log_archive_dest_2 max_failure=10 max_connections=5 reopen=180' scope=both;
14. Change and add log archive destination settings on the standby database so that when it (ent12c1s) becomes primary, it too can use its far sync instance for synchronous redo transport and, failing that, use asynchronous redo transport directly with whichever database is the standby at the time (ent12c1).
alter system set log_archive_dest_state_2='enable' scope=both;
alter system set log_archive_dest_2='service=FS12C1STNS SYNC AFFIRM db_unique_name=fs12c1s valid_for=(online_logfiles,primary_role) alternate=log_archive_dest_3 max_connections=5  max_failure=1' scope=both;

alter system set log_archive_dest_state_3='alternate' scope=both;
alter system set log_archive_dest_3='service=ENT12C1TNS ASYNC NOAFFIRM  db_unique_name=ent12c1 valid_for=(online_logfiles,primary_role) alternate=log_archive_dest_2 max_failure=10 max_connections=5 reopen=180' scope=both;
15. Increase the protection mode of the primary database to maximum availability.
ALTER DATABASE SET STANDBY TO MAXIMIZE AVAILABILITY;

SQL> SELECT PROTECTION_MODE FROM V$DATABASE;

PROTECTION_MODE
--------------------
MAXIMUM AVAILABILITY
16. Do a few log switches and verify that redo transport is happening via the far sync instance. The easiest way to monitor this is through the alert logs: a log switch will be logged in both the far sync instance's and the standby instance's alert logs if it was transported via the far sync instance. Any issues with the log archive destinations can be observed on the primary:
SQL> SELECT DEST_NAME,STATUS,DESTINATION FROM V$ARCHIVE_DEST WHERE DESTINATION IS NOT NULL;

DEST_NAME                      STATUS    DESTINATION
------------------------------ --------- ------------------------------
LOG_ARCHIVE_DEST_1             VALID     USE_DB_RECOVERY_FILE_DEST
LOG_ARCHIVE_DEST_2             VALID     FS12C1TNS
LOG_ARCHIVE_DEST_3             ALTERNATE ENT12C1STNS
and on the far sync instance
SQL> SELECT DEST_NAME,STATUS,DESTINATION FROM V$ARCHIVE_DEST WHERE DESTINATION IS NOT NULL;

DEST_NAME                      STATUS    DESTINATION
------------------------------ --------- ------------------------------
LOG_ARCHIVE_DEST_1             VALID     USE_DB_RECOVERY_FILE_DEST
LOG_ARCHIVE_DEST_2             VALID     ENT12C1STNS
STANDBY_ARCHIVE_DEST           VALID     USE_DB_RECOVERY_FILE_DEST
The above output shows that the primary is transporting to log archive dest 2 (status VALID) while dest 3 is still in alternate state. The far sync output shows that the far sync instance is shipping redo as per its log archive dest 2 value, with current status VALID.
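The log switches can be forced from the primary, and any destination errors checked, with something along these lines:

SQL> alter system switch logfile;
SQL> alter system switch logfile;
SQL> select dest_id,status,error from v$archive_dest where dest_id in (2,3);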

17. Shutdown abort the far sync instance and view the archive dest output. If the configuration works properly then log archive dest 3 should become valid and redo transport should happen directly between primary and standby in asynchronous mode.
SQL> SELECT DEST_NAME,STATUS,DESTINATION FROM V$ARCHIVE_DEST WHERE DESTINATION IS NOT NULL;

DEST_NAME                      STATUS    DESTINATION
------------------------------ --------- ------------------------------
LOG_ARCHIVE_DEST_1             VALID     USE_DB_RECOVERY_FILE_DEST
LOG_ARCHIVE_DEST_2             ALTERNATE FS12C1TNS
LOG_ARCHIVE_DEST_3             VALID     ENT12C1STNS
From the above output it can be seen that after the far sync instance is terminated, log archive dest 3 becomes the valid destination and log archive dest 2 is kept as an alternate destination. In the primary instance's alert log, the following can be observed when the far sync instance is terminated:
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Failed to archive log 1 thread 1 sequence 1503 (3113)
Destination LOG_ARCHIVE_DEST_2 no longer supports SYNCHRONIZATION
and the following in the standby instance's alert log:
Primary database is in MAXIMUM PERFORMANCE mode
Changing standby controlfile to MAXIMUM PERFORMANCE mode
It must also be mentioned that on a few occasions when the far sync instance was abruptly terminated (shutdown abort), the recovery process on the standby stopped due to lost writes:
MRP0: Background Media Recovery terminated with error 742  <-- far sync instance terminated
Fri Sep 12 17:32:28 2014
Errors in file /opt/app/oracle/diag/rdbms/ent12c1s/ent12c1s/trace/ent12c1s_pr00_10098.trc:
ORA-00742: Log read detects lost write in thread 1 sequence 1503 block 868
ORA-00312: online log 4 thread 1: '/opt/app/oracle/fast_recovery_area/ENT12C1S/onlinelog/o1_mf_4_b0y1dn8v_.log'
ORA-00312: online log 4 thread 1: '/opt/app/oracle/oradata/ENT12C1S/onlinelog/o1_mf_4_b0y1dml4_.log'
Managed Standby Recovery not using Real Time Apply
RFS[16]: Assigned to RFS process (PID:10165)
RFS[16]: Selected log 5 for thread 1 sequence 1504 dbid 209099011 branch 833730501
Fri Sep 12 17:32:28 2014
Recovery interrupted!
Recovered data files to a consistent state at change 19573793
Fri Sep 12 17:32:28 2014
Errors in file /opt/app/oracle/diag/rdbms/ent12c1s/ent12c1s/trace/ent12c1s_pr00_10098.trc:
ORA-00742: Log read detects lost write in thread 1 sequence 1503 block 868
ORA-00312: online log 4 thread 1: '/opt/app/oracle/fast_recovery_area/ENT12C1S/onlinelog/o1_mf_4_b0y1dn8v_.log'
ORA-00312: online log 4 thread 1: '/opt/app/oracle/oradata/ENT12C1S/onlinelog/o1_mf_4_b0y1dml4_.log'
Fri Sep 12 17:32:28 2014
MRP0: Background Media Recovery process shutdown (ent12c1s)
Fri Sep 12 17:32:28 2014
Archived Log entry 107 added for thread 1 sequence 1503 rlc 833730501 ID 0xde697d0 dest 3:
Fri Sep 12 17:38:51 2014
alter database recover managed standby database disconnect  <-- manual start at 17:38 after it was stopped 17:32
So it may be a good idea to keep an eye on the recovery when the far sync instance terminates. According to MOS note 1302539.1, when active data guard is in place there's automatic block repair, transparent to the user. Once the far sync instance is back up again, the redo transport goes back to the original setting:
SQL>  SELECT DEST_NAME,STATUS,DESTINATION FROM V$ARCHIVE_DEST WHERE DESTINATION IS NOT NULL;

DEST_NAME                      STATUS    DESTINATION
------------------------------ --------- ------------------------------
LOG_ARCHIVE_DEST_1             VALID     USE_DB_RECOVERY_FILE_DEST
LOG_ARCHIVE_DEST_2             VALID     FS12C1TNS
LOG_ARCHIVE_DEST_3             ALTERNATE ENT12C1STNS
And the following could be observed in the alert log of the primary:
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 1506
LGWR: Standby redo logfile selected for thread 1 sequence 1506 for destination LOG_ARCHIVE_DEST_2
Fri Sep 12 16:21:08 2014
Thread 1 advanced to log sequence 1506 (LGWR switch)
  Current log# 1 seq# 1506 mem# 0: /data/oradata/ENT12C1/onlinelog/o1_mf_1_9bcsl3ds_.log
  Current log# 1 seq# 1506 mem# 1: /data/flash_recovery/ENT12C1/onlinelog/o1_mf_1_9bcsl3hm_.log
Fri Sep 12 16:21:08 2014
Archived Log entry 1408 added for thread 1 sequence 1505 ID 0xde697d0 dest 1:
Fri Sep 12 16:21:11 2014
Thread 1 cannot allocate new log, sequence 1507
Checkpoint not complete
  Current log# 1 seq# 1506 mem# 0: /data/oradata/ENT12C1/onlinelog/o1_mf_1_9bcsl3ds_.log
  Current log# 1 seq# 1506 mem# 1: /data/flash_recovery/ENT12C1/onlinelog/o1_mf_1_9bcsl3hm_.log
Fri Sep 12 16:21:14 2014
Destination LOG_ARCHIVE_DEST_2 is SYNCHRONIZED
On the alert log of the far sync instance
Primary database is in MAXIMUM AVAILABILITY mode
Changing standby controlfile to RESYNCHRONIZATION level
Standby controlfile consistent with primary
RFS[1]: Assigned to RFS process (PID:3557)
RFS[1]: Selected log 5 for thread 1 sequence 1506 dbid 209099011 branch 833730501
Fri Sep 12 17:04:06 2014
******************************************************************
TT00: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
TT00: Standby redo logfile selected for thread 1 sequence 1506 for destination LOG_ARCHIVE_DEST_2
Fri Sep 12 17:04:07 2014
Changing standby controlfile to MAXIMUM PERFORMANCE mode
RFS[2]: Assigned to RFS process (PID:3561)
RFS[2]: Opened log for thread 1 sequence 1505 dbid 209099011 branch 833730501
Fri Sep 12 17:04:08 2014
Archived Log entry 203 added for thread 1 sequence 1505 rlc 833730501 ID 0xde697d0 dest 2:
Fri Sep 12 17:04:13 2014
Changing standby controlfile to MAXIMUM AVAILABILITY mode
On the standby alert log
Primary database is in MAXIMUM PERFORMANCE mode
Changing standby controlfile to MAXIMUM AVAILABILITY mode
Changing standby controlfile to RESYNCHRONIZATION level
Similarly, it is possible to shut down the standby instance and verify that the primary is able to ship redo to the far sync instance and that the redo is fetched by the standby once it is started again. With these tests, situation 1 in the figure shown at the beginning of the post is complete.

18. To test situation 2 in the figure above, do a switchover and check the redo transport via the far sync (fs12c1s) instance.

This concludes adding far sync instances to an existing data guard configuration on 12c.

Useful metalink notes
Cascaded Standby Databases in Oracle 12c [ID 1542969.1]
Data Guard 12c New Feature: Far Sync Standby [ID 1565071.1]
Best Practices for Corruption Detection, Prevention, and Automatic Repair - in a Data Guard Configuration [ID 1302539.1]

Tuesday, September 2, 2014

LOB Chunk and Tablespace Block Size

The chunk value corresponds to the data size used by Oracle when reading or writing a LOB value. Once set, the chunk size cannot be changed. Though it doesn't matter for LOBs stored in row, for out-of-row LOBs space is allocated in multiples of the chunk size.
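As a quick illustration of that allocation rule, the sketch below (plain Java, no database required; the LOB and chunk sizes are simply the values used later in this post plus a small 10KB example) computes the space an out-of-row LOB would occupy when rounded up to whole chunks:

```java
public class ChunkAllocation {

    // Space used by an out-of-row LOB: the LOB size rounded up to whole chunks.
    static long allocated(long lobBytes, long chunkBytes) {
        return ((lobBytes + chunkBytes - 1) / chunkBytes) * chunkBytes;
    }

    public static void main(String[] args) {
        long lob = 800 * 1024;                           // the 800KB blob used in the test case
        System.out.println(allocated(lob, 8 * 1024));    // 100 x 8K chunks, no waste
        System.out.println(allocated(lob, 32 * 1024));   // 25 x 32K chunks, no waste
        // A small LOB shows the rounding: a 10KB LOB occupies two 8K chunks
        // but a full 32K chunk when the chunk size is 32K.
        System.out.println(allocated(10 * 1024, 8 * 1024));
        System.out.println(allocated(10 * 1024, 32 * 1024));
    }
}
```

For large LOBs such as the 800KB blob here the rounding waste is negligible either way; the allocation granularity mainly matters for LOBs close to the chunk size.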
From Oracle documentation (for basicfile lobs) A chunk is one or more Oracle blocks. You can specify the chunk size for the BasicFiles LOB when creating the table that contains the LOB. This corresponds to the data size used by Oracle Database when accessing or modifying the LOB value. Part of the chunk is used to store system-related information and the rest stores the LOB value. If the tablespace block size is the same as the database block size, then CHUNK is also a multiple of the database block size. The default CHUNK size is equal to the size of one tablespace block, and the maximum value is 32K. Once the value of CHUNK is chosen (when the LOB column is created), it cannot be changed. Hence, it is important that you choose a value which optimizes your storage and performance requirements.
For securefile LOBs, CHUNK is an advisory size and is provided for backward compatibility purposes.
From a performance perspective, accessing LOBs in big chunks is considered more efficient. You can set CHUNK to the data size most frequently accessed or written. For example, if only one block of LOB data is accessed at a time, set CHUNK to the size of one block. If you have big LOBs and read or write large amounts of data, choose a large value for CHUNK.
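A minimal sketch of what chunk-sized access means on the client side: reading a stream in buffers matched to the chunk size reduces the number of read calls needed. Here a ByteArrayInputStream stands in for the BLOB's binary stream (in real code it would come from something like ResultSet.getBlob(...).getBinaryStream()); the buffer and payload sizes are assumptions taken from this post's test case:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ChunkedRead {

    // Read the whole stream using a buffer of the given size and
    // return the number of read calls that were needed.
    static int readInChunks(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        int calls = 0;
        while (in.read(buffer) != -1) {
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] lob = new byte[800 * 1024]; // same 800KB payload as the test case
        System.out.println("8K buffer reads:  "
                + readInChunks(new ByteArrayInputStream(lob), 8 * 1024));
        System.out.println("32K buffer reads: "
                + readInChunks(new ByteArrayInputStream(lob), 32 * 1024));
    }
}
```

With an 800KB payload the 8K buffer needs 100 read calls against 25 for the 32K buffer, mirroring the fewer, larger IOs seen on the server side with a bigger chunk.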
This post shows the results of a test case carried out to compare the performance benefits of using a large chunk size along with a tablespace with a large block size (8k vs 32k). The blob used for the test case is 800KB. The java code used for the test case is given at the end of the post. For each chunk size (8k vs 32k) the caching option was also changed (nocache vs cache) as it also has a direct impact on the IO usage. The lob is stored out of row in a separate tablespace from the table.
The table creation DDL used for the basicfile test is shown below (comment and uncomment each option depending on the test case). LOB32KTBS is a tablespace with a 32k block size while LOB8KTBS is a tablespace with an 8k block size.
create table ses(sesid varchar2(100), sesob blob) SEGMENT CREATION IMMEDIATE TABLESPACE users
LOB (sesob)
STORE AS object_lob_seg (
    TABLESPACE LOB32KTBS
    --TABLESPACE LOB8KTBS
    DISABLE STORAGE IN ROW
    CHUNK 32K
    --CHUNK 8K
    CACHE
    --NOCACHE
    PCTVERSION 0
    STORAGE (MAXEXTENTS UNLIMITED)
    INDEX object_lob_idx (
        TABLESPACE LOB32KTBS
        --TABLESPACE LOB8KTBS
        STORAGE (MAXEXTENTS UNLIMITED)
    )
)
/
The test cases comprised reading a LOB column for a row and inserting LOBs. The IO-related statistics comparison for the select test case is given below. Based on the graphs, a 32K chunk size on a tablespace with a 32K block size requires fewer logical and physical reads than an 8K chunk with the LOB segment on an 8K block size tablespace. Though not shown on the graphs, in a separate test, using a 32K chunk size while placing the LOB segment on an 8K block size tablespace had the same performance characteristics as an 8K chunk on an 8K block size tablespace. Conversely, an 8K chunk with the LOB segment on a 32K block size tablespace performed the same as a 32K chunk on a 32K block size tablespace. This means the chunk size alone is not going to reduce the amount of IO; the block size of the tablespace where the LOB segment is stored has an influence as well.

The next test was the inserting of LOBs. The results are shown on the following two graphs. Similar to the read test, having a large chunk size and tablespace block size for the LOB reduces the IO.




The same test was carried out for securefile LOB segments. The table creation DDL is given below. The only difference compared to the basicfile DDL is the "RETENTION NONE" clause; all other parameters/options are the same.
create table ses(sesid varchar2(100), sesob blob) SEGMENT CREATION IMMEDIATE TABLESPACE users
LOB (sesob)
STORE AS securefile object_lob_seg (
    TABLESPACE LOB32KTBS
    --TABLESPACE LOB8KTBS
    DISABLE STORAGE IN ROW
    CHUNK 32K
    --CHUNK 8K
    CACHE
    --NOCACHE
    RETENTION NONE
    STORAGE (MAXEXTENTS UNLIMITED)
    INDEX object_lob_idx (
        TABLESPACE LOB32KTBS
        --TABLESPACE LOB8KTBS
        STORAGE (MAXEXTENTS UNLIMITED)
    )
)
/
The results of the select test are shown on the graphs below. Similar to basicfile, the larger chunk size advisory and tablespace block size combination outperforms the smaller chunk/block size combination. In all cases securefile outperforms basicfile in the number of logical and physical reads.

The outcome of the insert test is also the same as the basicfile insert test: the larger chunk/block size combination outperforms the smaller one. Also, between basicfile and securefile, securefile outperforms basicfile.

These tests have shown that it's better to use large chunk/tablespace block sizes for larger LOBs to reduce the logical/physical IO related to LOBs.

Useful White papers
SecureFile Performance
Oracle 11g: SecureFiles

Related Post
Nologging vs Logging for LOBs and enq: CF - contention

Java code used for the test. For the code of the LobStat class refer to the earlier post.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import oracle.jdbc.OraclePreparedStatement;
import oracle.jdbc.pool.OracleDataSource;

public class LobChunkTest {

    final String URL = "jdbc:oracle:thin:@192.168.0.66:1521:ent11g2";
    final String USERNAME = "asanga";
    final String PASSWORD = "asa";

    public static void main(String[] args) {

        LobChunkTest test = new LobChunkTest();
        //Insert test
        test.insertTest();

        System.out.println("\n\n************* end of insert test **************\n\n");

        //select test
        test.selectTest();

    }

    public void insertTest() {
        try {

            OracleDataSource pool = new OracleDataSource();
            pool.setURL(URL);
            pool.setUser(USERNAME);
            pool.setPassword(PASSWORD);

            Connection con = pool.getConnection();
            con.setAutoCommit(false);

            long t1 = System.currentTimeMillis();

            LobStat.displayStats(con);

            // 800KB payload used as the BLOB value, with a couple of non-zero bytes
            byte[] x = new byte[800 * 1024];
            x[1] = 10;
            x[798 * 1024] = 20;

            for (int i = 0; i < 100; i++) {

                OraclePreparedStatement pr = (OraclePreparedStatement) con.prepareStatement("insert into ses values(?,?)");

                String sesid = "abcdefghijklmnopqrstuvwxy" + Math.random();
                pr.setString(1, sesid);
                pr.setBytes(2, x);

                pr.execute();
                con.commit();
                pr.close();

            }

            long t2 = System.currentTimeMillis();

            LobStat.displayStats(con);

            con.close();

            System.out.println("time taken " + (t2 - t1));

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    public  void selectTest() {
        try {

            OracleDataSource pool = new OracleDataSource();
            pool.setURL(URL);
            pool.setUser(USERNAME);
            pool.setPassword(PASSWORD);

            Connection con = pool.getConnection();
            con.setAutoCommit(false);

            String[] sesids = new String[100];

            PreparedStatement pr1 = con.prepareStatement("select sesid from ses");
            ResultSet rs1 = pr1.executeQuery();
            int i = 0;
            while (rs1.next()) {

                sesids[i] = rs1.getString(1);

                i++;

            }
            rs1.close();
            pr1.close();
            con.close();

            con = pool.getConnection();
            LobStat.displayStats(con);

            OraclePreparedStatement pr = (OraclePreparedStatement) con.prepareStatement("select SESOB from ses where sesid=?");

            long t1 = System.currentTimeMillis();
            for (String x : sesids) {


                pr.setString(1, x);
                ResultSet rs = pr.executeQuery();

                while (rs.next()) {

                    byte[] blob = rs.getBytes(1);

                }

                rs.close();
            }

            long t2 = System.currentTimeMillis();
            System.out.println("time taken " + (t2 - t1));

            LobStat.displayStats(con);

            pr.close();

            con.close();

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}