Tuesday, August 7, 2012

Variable Length In-Clause : SQL vs PL/SQL Method Comparison

When the number of variables in a in-clause varies this results in a new statement being created. The widely available solution (after some googling) for this is to create a PL/SQL function and use a table type to run the in-clause where only one statement is created and will get re-used irrespective of number of parameters. This is a comparison between these two options of using the PL/SQL method and creating unique SQL for each set of in-clause variables.
For PL/SQL option create a table type and function as below
create or replace type mytabletype as table of number;

create or replace function mytablefun(input in varchar2) return
mytabletype
as 
    l_str   LONG DEFAULT input || ',';
    l_n     NUMBER;
    l_data  mytabletype := mytabletype();
BEGIN
    LOOP
        l_n := instr( l_str, ',' );
        exit when (nvl(l_n,0) = 0);
        l_data.extend;
        l_data( l_data.count ) := ltrim(rtrim(substr(l_str,1,l_n-1)));
        l_str := substr( l_str, l_n+1 );
    END LOOP;
    RETURN l_data;
END;
/
The test case is run using a java program which generate the sql statments. Code is given below
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class InclauseTest {

    public static void main(String[] args)  {
        try {

            //Connection pool creation code eg. OracleDataSource or UDP

            Connection con = pool.getConnection();

            long t1 = System.currentTimeMillis();

            newsqlMehtod(con);       //runs sql method test
//            newfunctionMethod(con);  // uncomment to run pl/sql method test

            long t2 = System.currentTimeMillis();

            DBStats.displayStats(con);


            System.out.println("\n\n total time taken : "+(t2-t1));
            con.close();

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }


 public static void newsqlMehtod(Connection con){

        String sql1 = "select * from x where a in (";
                        
        for (int i = 0 ; i < 10000; i++){
            
            try {
                int k = i % 10;
                PreparedStatement pr = con.prepareStatement(sql1+prepareStringCreate(k)+")");

                preparePopulate(pr, k+1);


                ResultSet rs = pr.executeQuery();

                while(rs.next()){

                    int q = rs.getInt(1);
                    double p = rs.getDouble(2);
                }

                rs.close();
                pr.close();


            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

  public static String prepareStringCreate( int i){
        
        StringBuilder x = new StringBuilder("?");

        for(int j = 0; j< i; j++){
            
            x.append(",?");
        }
        
        return x.toString();
        
    }

  public static void preparePopulate(PreparedStatement pr, int i) throws SQLException{

        for(int j = 0; j< i; j++){

            pr.setInt(j+1, (j+1)* 10);
        }

    }

 public static void newfunctionMethod(Connection con){

             String sql = "select * from x where a in (select * from table(SELECT CAST( mytablefun( ? ) AS mytabletype ) FROM dual))";

        for (int i = 0 ; i < 10000; i++){
            try {
                PreparedStatement pr = con.prepareStatement(sql);

               StringBuilder x = new StringBuilder("10");

               int q = i%10;

               for (int j = 0; j<q  ; j++){

                   x.append(","+((j+2)*10));
               }

                
                pr.setString(1, x.toString());
                ResultSet rs = pr.executeQuery();

                while(rs.next()){

                    int k = rs.getInt(1);
                    double p = rs.getDouble(2);
                }

                rs.close();
                pr.close();

            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }

    }
}

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class DBStats {

    public static void displayStats(Connection con) {
        try {
            PreparedStatement pr = con.prepareStatement("select name,value " + "from v$mystat,v$statname " + "where v$mystat.statistic#=v$statname.statistic# " 
//                    + "and v$statname.statistic# in (11,12)"); //11gR1
                    + "and v$statname.statistic# in (16,17)"); //11gR2

            ResultSet rs = pr.executeQuery();


            while(rs.next()){

                System.out.println(rs.getString(1)+" "+rs.getDouble(2));
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }

    }
}
Running the test case once will result in 10000 sql statements being executed with number of values used in the in-clause varying from 1 - 10. The DBStats class is used to get CPU used by the session values. The java test program is run twice (resulting in 20000 executions) and SQL statistics are taken. Shared pool is flushed before the run but not between each run (so the statistics are cumulative).


First the results from SQL method. The session stats gives
first run
CPU used when call started 47.0
CPU used by this session 49.0

second run
CPU used when call started 43.0
CPU used by this session 43.0
SQL stats
SQL> select sql_id,cpu_time,elapsed_time,executions,SHARABLE_MEM,PERSISTENT_MEM,RUNTIME_MEM,
sql_text from v$sql where sql_text like 'select * from x where a in%' order by sql_text;

SQL_ID          CPU_TIME ELAPSED_TIME EXECUTIONS SHARABLE_MEM PERSISTENT_MEM RUNTIME_MEM SQL_TEXT
------------- ---------- ------------ ---------- ------------ -------------- ----------- ----------------------------------------------------------------------
26fb47dhnuz88      28994        94332       2000        19753           4496        3408 select * from x where a in (:1 )
068anap1mvbdx      46993       105790       2000        15701           4648        3528 select * from x where a in (:1 ,:2 )
387t0d508fkzb      55993       107300       2000        19777           4728        3576 select * from x where a in (:1 ,:2 ,:3 )
3hm2vcm9q6zup      35994       108100       2000        19797           4808        3624 select * from x where a in (:1 ,:2 ,:3 ,:4 )
7fg42cswna64p      55992       107610       2000        19825           4888        3672 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 )
89k5ysfx65zpu      52992       109258       2000        19845           4968        3720 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 )
32w3mnqx0vhf4      40994       109297       2000        19865           5048        3768 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 )
4yxvar786t71u      63987       110639       2000        19885           5128        3816 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 )
7x0jrn3nj8ux6      54987       112536       2000        19913           5208        3864 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 )
0z3udxfqg2p0p      57998       112400       2000        19942           5288        3912 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 ,:10 )

10 rows selected.
As expected the SQL method results in 10 statements being created and each getting equal number of executions. Stats for each executions, although elapsed time per exec is roughly same CPU time values vary.
SQL> SELECT sql_id,
  2    cpu_time,
  3    round(cpu_time/executions,2) as cpu_per_exec,
  4    elapsed_time,
  5    round(elapsed_time/executions,2) as ela_per_exec,
  6    executions,
  7  --  SHARABLE_MEM,
  8  --    PERSISTENT_MEM,
  9  --  RUNTIME_MEM,
 10    sql_text
 11  FROM v$sql
 12  WHERE sql_text LIKE 'select * from x where a in%'
 13  ORDER BY sql_text;

SQL_ID        CPU_TIME CPU_PER_EXEC ELAPSED_TIME ELA_PER_EXEC EXECUTIONS SQL_TEXT
------------- -------- ------------ ------------ ------------ ---------- ---------------------------------------------------------------------
26fb47dhnuz88    28994         14.5        94332        47.17       2000 select * from x where a in (:1 )
068anap1mvbdx    46993         23.5       105790         52.9       2000 select * from x where a in (:1 ,:2 )
387t0d508fkzb    55993           28       107300        53.65       2000 select * from x where a in (:1 ,:2 ,:3 )
3hm2vcm9q6zup    35994           18       108100        54.05       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 )
7fg42cswna64p    55992           28       107610        53.81       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 )
89k5ysfx65zpu    52992         26.5       109258        54.63       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 )
32w3mnqx0vhf4    40994         20.5       109297        54.65       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 )
4yxvar786t71u    63987        31.99       110639        55.32       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 )
7x0jrn3nj8ux6    54987        27.49       112536        56.27       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 )
0z3udxfqg2p0p    57998           29       112400         56.2       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 ,:10 )
Next is the PL/SQL method, again java program is run twice and shared pool is flushed before the run but not between the runs. The session stats gives
first run
CPU used when call started 479.0
CPU used by this session 481.0

second run
CPU used when call started 267.0
CPU used by this session 267.0
CPU used by session is high compared to the SQL statement method earlier. The sql stats
SQL_ID          CPU_TIME ELAPSED_TIME EXECUTIONS SHARABLE_MEM PERSISTENT_MEM RUNTIME_MEM SQL_TEXT
------------- ---------- ------------ ---------- ------------ -------------- ----------- ----------------------------------------------------------------------
1p40d61qp4u6g    6333007      7284060      19997        28748           7040        5328 select * from x where a in (select * from table(SELECT CAST( mytablefu
                                                                                         n( :1  ) AS mytabletype ) FROM dual))

1p40d61qp4u6g      91986        91943          3        28748           7040        5328 select * from x where a in (select * from table(SELECT CAST( mytablefu
                                                                                         n( :1  ) AS mytabletype ) FROM dual))
Per exec values
SQL_ID            CPU_TIME CPU_PER_EXEC ELAPSED_TIME ELA_PER_EXEC EXECUTIONS SQL_TEXT
------------- ------------ ------------ ------------ ------------ ---------- ---------------------------------------------------------------------
1p40d61qp4u6g      6333007        316.7      7284060       364.26      19997 select * from x where a in (select * from table(SELECT CAST( mytablef
                                                                             un( :1  ) AS mytabletype ) FROM dual))

1p40d61qp4u6g        91986        30662        91943     30647.67          3 select * from x where a in (select * from table(SELECT CAST( mytablef
                                                                             un( :1  ) AS mytabletype ) FROM dual))
First some clarification as to why there are two sets of statments when only one is expected. Looking at the sql shared view
SQL> SELECT sql_id,
  2  child_number,
  3  address,
  4  executions,sql_text
  5  FROM v$sql
  6  WHERE sql_text LIKE 'select * from x where a in%'
  7  ORDER BY sql_text;

SQL_ID        CHILD_NUMBER ADDRESS          EXECUTIONS SQL_TEXT
------------- ------------ ---------------- ---------- ----------------------------------------------------------------------
1p40d61qp4u6g            1 00000002088B14D0      19997 select * from x where a in (select * from table(SELECT CAST( mytablefu
                                                       n( :1  ) AS mytabletype ) FROM dual))

1p40d61qp4u6g            0 00000002088B14D0          3 select * from x where a in (select * from table(SELECT CAST( mytablefu
                                                       n( :1  ) AS mytabletype ) FROM dual))
                
                
SQL> select sql_id,address,child_number,use_feedback_stats from V$SQL_SHARED_CURSOR where sql_id='1p40d61qp4u6g'  and address='00000002088B14D0';

SQL_ID        ADDRESS          CHILD_NUMBER USE_FEEDBACK_STATS
------------- ---------------- ------------ --------------------
1p40d61qp4u6g 00000002088B14D0            0 Y
1p40d61qp4u6g 00000002088B14D0            1 N
"Y" on use_feedback_stats means "A hard parse is forced so that the optimizer can reoptimize the query with improved cardinality estimates" (From Oracle docs).

The comparison is made on cpu time and shareable memory. First the cpu time comparison
Sum of cpu time from the SQL method test = 494924
Sum of cpu time from the PL/SQL method test = 6424993
It could be seen that PL/SQL method uses over 1198% more CPU time than the SQL method.

Comparing the shareable memory used by two methods
Sum of shareable memory from SQL method test = 194303
Sum of shareable memory from PL/SQL method test = 57496
It could be seen due to creation of multiple statements SQL method uses 238% more shareable memory than the PL/SQL method.

The comparison shows the two methods have trade-off between cpu and memory.