Friday, October 24, 2014

APPEND_VALUES Hint and JDBC

APPEND_VALUE hint was introduced in 11gR2 for direct path inserts with values clause. Append hint is only useful when doing direct path loading with a select sub-query. On Oracle documentation mention append_value hint is useful in enhnacing performance and list OCI program and PL/SQL as example. There's no mention of JDBC. Is there a performance gain when using append_value hint when the inserts are issued through JDBC?
This is a simple test case that tried to answer this question. The test case involves inserting 50000 rows to a single column table. The insert statement is issued with and without (conventional insert) the hint. Also the table is created with and without logging enabled. Append* hints generate less redo when the table has nologging enabled. But this is not entirely dependent on table but also other factors as well. The test measures redo size, cpu used by session and the total elapsed time to insert the rows (this is measured on the side of the java test code) statistics. Test is repeated for above combination employing JDBC batching as well. The java code used for the test case is given at the end of the post. The tests were carried out on 11.2.0.3 EE database. (update 2014/10/24: Same test was done on 12.1.0.2 EE DB. The results more or less the same. In 12.1.0.2 even nologging + batching with append hint didn't out perform same test without the hint)
First comparison is the size of the redo generated during the inserts.
The first graph shows the redo generated without batching inserts. There's no surprise that when the table is in nologging mode the amount of redo generated is less than when table is in logging mode. But not having the append hint (in this post append hint means append_value hint) seems to generate less redo than having the append hint. This is true when table is logging and nologging mode. On the other hand when the inserts are batched and when table is in nologging having the append hint results in minimum redo being generated and this amount is less than the redo generated without the append hint (in both logging and nologging modes). If the table is in logging mode then batching without the append hint results in less redo generated compared to using the append hint.
The next statistic is the CPU used for completing the inserts. The CPU used by this session statistics is used to calculate this by capturing CPU statistics value before and after inserts.
Batching the inserts results in minimum CPU being consumed for inserts compared to not batching. There's no great deal difference in the CPU consumption when the append hint is used compared to it not being used. However when inserts are not batched the amount of CPU consumed is doubled and tripled when append hint is used compared to without the hint. So in terms of CPU consumption, having the append hint and not batching the inserts will result in performance degradation.
Final statistics is the total elapsed time to insert the rows. This is roughly the total execution time for the test code. The time is measured in milliseconds.
Similar to CPU usage, batching results in lowest elapsed time. This is no surprise as CPU is a component of the overall elapsed time. However when inserts are not batched then having the append hint results in high elapsed time compared to inserting without the append hint.
From this limited test case it seems that only time that's beneficial to use append hint with JDBC is when inserts are batched and table is in nologging mode. Other times the non batched inserts and batched inserts out perform the effects of append hint. But having nologging table may not be an option in a production system and even if it is possible have a nologging table, highly concurrent inserts into a nologging table could results high number of control file related wait events.
Furthermore there few other points to consider before deciding to use append hint in the application. When append_value is used to insert to a table another session cannot insert until first session commits. The second session will hang and it will be waiting on a enq: TM - contention wait event which is usually associated with unindexed foreign key related issues. So the concurrent nature of the inserts must be considered. If the inserts are highly concurrent then having the append hint may not be a good idea.
Within the same session after one insert another cannot be made without first committing the previous insert.
SQL> set auto off
SQL>  insert /*+ append_values */ into append_hint_test values ('abc');

1 row created.

SQL> insert /*+ append_values */ into append_hint_test values ('def');
insert /*+ append_values */ into append_hint_test values ('def')
                                 *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel
Therefore java codes that's been written to reuses the cursors and have auto commit set to false will encounter following error
Exception in thread "main" java.sql.SQLException: ORA-12838: cannot read/modify an object after modifying it in parallel
Also the append hint results in direct loading of data to the end of the table. This results in continuous growth of the table even if there's free space available (which may or may not be a problem in some cases). Therefore it maybe better to use traditional batching with JDBC than using append hint as the negative consequence of using it seem to out weigh the gains.



On the other hand in PL/SQL with batch inserts (using FORALL) the append hint seem to out perform the conventional inserts. PL/SQL code used is also given at the end of the post. Below graph shows the redo size generated for inserting 50000 rows with and without append hint.


Create table for the test cases.
create table append_hint_test(value varchar2(50));
create table append_hint_plsql_test(value varchar2(50));
Java Code Used.
public class AppendHintTest {

    public static void main(String[] args) throws Exception {

        OracleDataSource dataSource = new OracleDataSource();
        dataSource.setURL("jdbc:oracle:thin:@192.168.0.66:1521:ent11g2");
        dataSource.setUser("asanga");
        dataSource.setPassword("asa");

        Connection con = dataSource.getConnection();
//               con.setAutoCommit(false);  
        DBStats stats = new DBStats();
        stats.initStats(con);

        String SQL = "insert /*+ APPEND_VALUES */ into append_hint_test values (?)";
//               String SQL = "insert into append_hint_test values (?)"; 

        PreparedStatement pr = con.prepareStatement(SQL);


        long t1 = System.currentTimeMillis();

        for (int i = 0; i < 50000; i++) {
            pr.setString(1, "akhgaipghapga " + i);
            pr.execute();

//                   pr.addBatch();
//                   if(i%1000==0){
//                       pr.executeBatch();

//                   }
        }

//               pr.executeBatch();
        con.commit();
        pr.close();

        long t2 = System.currentTimeMillis();
        String[][] statsValues = stats.getStatsDiff(con);
        con.close();
        System.out.println("time taken " + (t2 - t1));

        for (String[] x : statsValues) {

            System.out.println(x[0] + " : " + x[1]);
        }

    }
}

public class DBStats {

    private HashMap stats = new HashMap<>();
    private String SQL = "select name,value " + "from v$mystat,v$statname " + "where v$mystat.statistic#=v$statname.statistic# "
            + "and v$statname.name in ('CPU used when call started','CPU used by this session','db block gets',"
            + "'db block gets from cache','db block gets from cache (fastpath)','db block gets direct',"
            + "'consistent gets','consistent gets from cache','consistent gets from cache (fastpath)',"
            + "'consistent gets - examination','consistent gets direct','physical reads',"
            + "'physical reads direct','physical read IO requests','physical read bytes',"
            + "'consistent changes','physical writes','physical writes direct',"
            + "'physical write IO requests','physical writes from cache','redo size')";

    public void initStats(Connection con) {
        try {
            PreparedStatement pr = con.prepareStatement(SQL);

            ResultSet rs = pr.executeQuery();


            while (rs.next()) {

                stats.put(rs.getString(1), rs.getLong(2));
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }

    }

    public String[][] getStatsDiff(Connection con) {
        
        String[][] statDif = new String[stats.size()][2];

        try {
            PreparedStatement pr = con.prepareStatement(SQL);

            ResultSet rs = pr.executeQuery();

            int i = 0;
            while (rs.next()) {
               Long val = rs.getLong(2) - stats.get(rs.getString(1));
               statDif[i][0] = rs.getString(1);
               statDif[i][1] = val.toString();
               i++;
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }


        return statDif;


    }
}
PL/SQL code used. Before running the PL/SQL test populate the Append_Hint_Test table with rows using java code above.
SET serveroutput ON
DECLARE
Type Arry_Type
IS
  TABLE OF Loadt%Rowtype INDEX BY PLS_INTEGER;
  Loadtable Arry_Type;
  redosize1 NUMBER;
  redosize2 NUMBER;
  t1        NUMBER;
  t2        NUMBER;
Begin
  EXECUTE immediate 'truncate table append_hint_plsql_test';
  Select * Bulk Collect Into Loadtable From Append_Hint_Test ;
  
  Dbms_Output.Put_Line(Loadtable.Count);
  
  SELECT Value
  INTO Redosize1
  FROM V$mystat,
    V$statname
  Where V$mystat.Statistic#=V$statname.Statistic#
  AND V$statname.Name      = 'redo size';
  
  Dbms_Output.Put_Line('redo size 1 '||Redosize1);
  T1          := Dbms_Utility.Get_Time;
  
  Forall Indx IN 1 .. Loadtable.Count
--  INSERT /*+ APPEND_VALUES */
--  INTO append_hint_plsql_test VALUES
--    (Loadtable(Indx).A
--    );
  Insert Into Append_Hint_Plsql_Test Values   (Loadtable(Indx).A   );

  Commit;
  
  T2 := Dbms_Utility.Get_Time;
  
  SELECT Value
  INTO Redosize2
  FROM V$mystat,
    V$statname
  WHERE V$mystat.Statistic#=V$statname.Statistic#
  AND V$statname.Name      = 'redo size';
  Dbms_Output.Put_Line('redo size 2 '||Redosize2);
  Dbms_Output.Put_Line('redo generated : '||(Redosize2-Redosize1)|| ' Time taken : '||(t2-t1));
END;
/