Sunday, December 16, 2012

java.sql.SQLRecoverableException: IO Error: Connection reset and /dev/urandom

The following stack trace could be seen when the application server was starting:
java.sql.SQLRecoverableException: IO Error: Connection reset
        at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:886)
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1167)
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1289)
        at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1491)
        at oracle.jdbc.driver.OracleStatementWrapper.executeQuery(OracleStatementWrapper.java:406)
        
Caused by: java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:168)
        at oracle.net.ns.Packet.receive(Packet.java:300)
        at oracle.net.ns.DataPacket.receive(DataPacket.java:106)
        at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:315)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:260)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:185)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:102)
        at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:124)
        at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:80)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1137)
        at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:290)
        at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
        at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
        at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:193)
        at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:873)
        ... 6 more
The application servers (6 of them) were running on RHEL 5 (x86_64), and this was a new deployment so there was little activity on the servers. The JDBC driver version was 11.2.0.3.
It seems the JDBC thin client running on Linux uses /dev/random for security purposes, and the amount of activity (entropy, the disorder in the system, to be precise) determines how fast /dev/random can deliver random numbers. A good explanation of this is available here. By default JDBC uses /dev/random, which could cause JDBC connection creation to hang when /dev/random cannot generate random numbers fast enough.


One of the solutions (more are available at the above mentioned link) is to make JDBC use /dev/urandom, which can deliver random numbers with less system activity than /dev/random requires. (According to the link above these values are "theoretically vulnerable to a cryptographic attack".)
JDBC can be made to use /dev/urandom by specifying it as a system property (the triple "/" is needed):
-Djava.security.egd=file:///dev/urandom
For some systems alternate formats of the URL may be needed; these are available in the metalink notes given below.
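The property can also be set programmatically, provided it is set before the JVM initializes SecureRandom (that is, before the first connection attempt). Below is a minimal sketch; the connection details are hypothetical and for illustration only.
import java.sql.Connection;
import java.sql.DriverManager;

public class UrandomTest {

    public static void main(String[] args) throws Exception {

        // Must be set before any SecureRandom/JDBC initialization; equivalent to
        // passing -Djava.security.egd=file:///dev/urandom on the command line.
        System.setProperty("java.security.egd", "file:///dev/urandom");

        // hypothetical connection details, for illustration only
        Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@192.168.0.99:1521:orcl", "scott", "tiger");
        System.out.println("connected : " + !con.isClosed());
        con.close();
    }
}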

Useful metalink notes
ODI Receives Timeout Messages On Linux x86 With Java 1.5 Onwards [ID 1277360.1]
Oracle Service Registry 11.1 Fails To Install with 'java.sql.SQLRecoverableException: IO Error: Connection reset' Error [ID 1343776.1]
JDBC 11.2.0.3 Application Fails With java.sql.SQLRecoverableException: IO Error: Connection reset [ID 1575238.1]

Wednesday, December 12, 2012

HAIP for Private Interconnect in 11gR2

Multiple NICs can be defined for the private interconnect during installation (or afterwards) on 11gR2. When there are multiple NICs defined for the private interconnect, Oracle creates highly available IPs (HAIP) for the interconnect traffic.
From 11.2.0.2 onwards, even if a single NIC is used for the private interconnect, Oracle will use HAIP by default. Because of this the v$cluster_interconnects view shows the HAIP on 11gR2 (11.2.0.2 or later), unlike earlier versions such as 10g or 11gR1.
For example, on 11gR1 the IPs used for the private interconnect could be identified using the v$cluster_interconnects view:
SQL> select * from gv$cluster_interconnects order by 1;

   INST_ID NAME            IP_ADDRESS       IS_ SOURCE
---------- --------------- ---------------- --- -------------------------------
         1 eth1            192.168.1.87     NO  Oracle Cluster Repository
         2 eth1            192.168.1.88     NO  Oracle Cluster Repository
Alternatively, the database alert log will have the following entries:
Cluster communication is configured to use the following interface(s) for this instance
  192.168.1.88
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
However, due to HAIP this is not the same on 11gR2; what was specified during installation is not what appears in v$cluster_interconnects.
SQL> select * from gv$cluster_interconnects order by 1;

   INST_ID NAME            IP_ADDRESS       IS_ SOURCE
---------- --------------- ---------------- --- -------------------------------
         1 eth1:1          169.254.73.209   NO
         2 eth1:1          169.254.180.78   NO


Grid automatically creates IPs in the 169.254.*.* link local address range to be used for HAIP. These IPs are visible when ifconfig is executed and appear on the same NIC assigned to the private interconnect IP.
eth1      Link encap:Ethernet  HWaddr 08:00:27:7C:F4:94
          inet addr:192.168.1.88  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe7c:f494/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1997903 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1680302 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1349158236 (1.2 GiB)  TX bytes:965427458 (920.7 MiB)

eth1:1    Link encap:Ethernet  HWaddr 08:00:27:7C:F4:94
          inet addr:169.254.73.209  Bcast:169.254.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Instead of the getif option, iflist could be used to list the interface associated with the HAIP:
oifcfg getif
eth0  192.168.0.0  global  public
eth1  192.168.1.0  global  cluster_interconnect

oifcfg iflist -p -n
eth0  192.168.0.0  PRIVATE  255.255.255.0
eth1  192.168.1.0  PRIVATE  255.255.255.0
eth1  169.254.0.0  UNKNOWN  255.255.0.0
Both the ASM and DB alert logs will also list the HAIP instead of the actual IP used during installation. From the 11gR2 DB alert log:
Cluster communication is configured to use the following interface(s) for this instance
  169.254.73.209
cluster interconnect IPC version:Oracle UDP/IP (generic)
From the ASM alert log:
Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.
  [name='eth1:1', type=1, ip=169.254.73.209, mac=08-00-27-7c-f4-94, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface 'eth0' configured from GPnP for use as a public interface.
  [name='eth0', type=1, ip=192.168.0.86, mac=08-00-27-c1-59-a3, net=192.168.0.0/24, mask=255.255.255.0, use=public/1]

Useful metalink notes
11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip [ID 1210883.1]

Saturday, December 1, 2012

Using OracleConnectionWrapper to Clear Session Context in JDBC Connection Pools

Using application context to limit data visibility is one way VPDs are implemented. However, in applications with JDBC connection pools this creates an additional challenge, since the session context is not cleared automatically when the connection close method is called. It is only a logical close, and the connection returns to the pool with the session context values intact. The following Java code illustrates this behavior. The JDBC libraries used are:
Driver Name Oracle JDBC driver
Driver Version 11.2.0.3.0
Driver Major Version 11
Driver Minor Version 2
Database Major Version 11
Database Minor Version 2
The DBConnection class, which is the connection pool class, is the same class used earlier and is not listed below.
public class Test {

    public static void main(String[] args) {

        try {

            DBConnection pool = new DBConnection("sbx", "sbx", "192.168.0.99", "1521", "fgacdb");
            
            System.out.println("setting context");
            testSet(pool);
            System.out.println("context set done connection closed");
            Thread.sleep(1 * 5 * 1000);
            System.out.println("getting from pool after close");
            testGet(pool);
            System.out.println("explicit clearing");
            testClear(pool);
            System.out.println("explicit clearing done");
            Thread.sleep(1 * 5 * 1000);
            System.out.println("getting context value after clear");
            testGet(pool);

        } catch (Exception ex) {

            ex.printStackTrace();
        }

    }

    public static void testSet(DBConnection pool) throws SQLException {

        Connection[] cons = new Connection[10];

        for (int i = 0; i < 10; i++) {

            cons[i] = pool.getConnection();
        }

        for (int i = 0; i < 10; i++) {
            try {
                Connection con = cons[i];

                CallableStatement cls = con.prepareCall("{call fgac_ctx_pkg.fgac_set_context(?,?)}");
                cls.setString(1, "user_id");
                cls.setString(2, "user " + i);
                cls.executeUpdate();
                cls.close();
                con.close();

            } catch (SQLException ex) {
                ex.printStackTrace();
            }
        }
    }

    public static void testGet(DBConnection pool) throws SQLException {

        Connection[] cons = new Connection[10];

        for (int i = 0; i < 10; i++) {

            cons[i] = pool.getConnection();
        }

        for (int i = 0; i < 10; i++) {
            try {
                Connection con = cons[i];

                PreparedStatement pr = con.prepareStatement("select sys_context ('fgac_ctx','user_id') from dual");
                ResultSet rs = pr.executeQuery();

                while (rs.next()) {

                    System.out.println(rs.getString(1));
                }

                rs.close();
                pr.close();
                con.close();

            } catch (SQLException ex) {
                ex.printStackTrace();
            }
        }
    }

    public static void testClear(DBConnection pool) throws SQLException {

        Connection[] cons = new Connection[10];

        for (int i = 0; i < 10; i++) {

            cons[i] = pool.getConnection();
        }

        for (int i = 0; i < 10; i++) {
            try {
                Connection con = cons[i];

                CallableStatement cls = con.prepareCall("{call fgac_ctx_pkg.fgac_clear_context(?)}");
                cls.setString(1, "user_id");

                cls.executeUpdate();

                cls.close();
                con.close();

            } catch (SQLException ex) {
                ex.printStackTrace();
            }
        }
    }
}
The code uses the following PL/SQL package:
create or replace package fgac_ctx_pkg is

procedure fgac_set_context( i_attribute in varchar2, value in varchar2 );
procedure fgac_clear_context (i_attribute in varchar2 );

end fgac_ctx_pkg;
/

create or replace package body fgac_ctx_pkg is

procedure fgac_set_context
( i_attribute in varchar2, value in varchar2 )
is
begin
dbms_Session.set_context('fgac_ctx',i_attribute, value);
end;

procedure fgac_clear_context
( i_attribute in varchar2 )
is
begin
dbms_Session.clear_context('fgac_ctx',attribute => i_attribute);
end;

end fgac_ctx_pkg;
/

create or replace context fgac_ctx using FGAC_CTX_PKG;
Running the code gives the following output:
setting context
context set done connection closed
getting from pool after close
user 2
user 1
user 0
user 3
user 4
user 5
user 7
user 6
user 8
user 9
explicit clearing
explicit clearing done
getting context value after clear
null
null
null
null
null
null
null
null
null
null
It can be seen from the output that after setting the session context and calling the connection close method, it is still possible to get the session context values when connections are borrowed from the connection pool. The solution is to explicitly call the session clear method. After the explicit clear the session context values are no longer there, hence the null output.


Explicit setting and clearing may be feasible for a small program, but when the application is already developed, adding these additional lines may require a considerable amount of work and testing. This is where the OracleConnectionWrapper comes in handy: it allows the connection close method to be wrapped so that the explicit clear can be called from one place, requiring minimal changes to an existing application. Similarly, setting the context value can be done by overriding the getConnection method of the connection pool class.
Extend OracleConnectionWrapper to create a customized wrapper.
public class MyOracleConnectionWrapper extends OracleConnectionWrapper {

    public MyOracleConnectionWrapper(Connection con) {

        super((OracleConnection) con);
    }

    @Override
    public void close() throws SQLException {


        System.out.println("clearing");

        CallableStatement cls = super.prepareCall("{call fgac_ctx_pkg.fgac_clear_context(?)}");
        cls.setString(1, "user_id");
        cls.executeUpdate();
        cls.close();
        super.close();
    }
}
Modify the connection pool class by overriding the getConnection method and populating the session context with the user ID using application specific methods.
public class DBConnection {

    private PoolDataSource pds = null;

    public DBConnection(String username, String password, String host, String port, String sid) {
        try {
            pds = PoolDataSourceFactory.getPoolDataSource();
            pds.setUser(username);
            pds.setPassword(password);
            pds.setURL("jdbc:oracle:thin:@" + host + ":" + port + ":" + sid);
            pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
            pds.setInitialPoolSize(10);
            pds.setMinPoolSize(10);
            pds.setMaxPoolSize(15);

        } catch (SQLException ex) {
            ex.printStackTrace();
        }
    }

    public Connection getConnection() throws SQLException {

        MyOracleConnectionWrapper con = new MyOracleConnectionWrapper((OracleConnection) pds.getConnection());

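        // populate the session context for the borrowing user; applicationSessionObj is an
        // application specific object (not shown here) that supplies the current user's ID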
        CallableStatement cls = con.prepareCall("{call fgac_ctx_pkg.fgac_set_context(?,?)}");
        cls.setString(1, "user_id");
        cls.setString(2, "user " + applicationSessionObj.getUserID());
        cls.executeUpdate();
        cls.close();

        return con;

    }
}
With this code, when a connection is borrowed from the connection pool it will have the session context values associated with that user ID, and when the connection close method is called the session context will be cleared.
Similar to the session context, auto commit set at the connection level is also not reset when a connection is returned to the pool. To examine this behavior, change the for loop where connections are borrowed from the pool as follows in the testSet method:
for (int i = 0; i < 10; i++) {

            cons[i] = pool.getConnection();
            System.out.println(cons[i].getAutoCommit());
            cons[i].setAutoCommit(false);

        }
and in the testGet method:
 for (int i = 0; i < 10; i++) {

            cons[i] = pool.getConnection();
            System.out.println(cons[i].getAutoCommit());
        }
This shows the status of auto commit when connections are initially borrowed from the connection pool versus the status after auto commit is set to false and the connection is closed, returned to the pool and borrowed again.
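The same wrapper approach could, in principle, be used to reset auto commit on close as well, so that connections return to the pool with the JDBC default (auto commit on). Below is a sketch of such a close method, extending the MyOracleConnectionWrapper class from above; treat it as illustrative rather than a definitive implementation.
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import oracle.jdbc.OracleConnection;
import oracle.jdbc.OracleConnectionWrapper;

public class MyOracleConnectionWrapper extends OracleConnectionWrapper {

    public MyOracleConnectionWrapper(Connection con) {
        super((OracleConnection) con);
    }

    @Override
    public void close() throws SQLException {

        // clear the session context as before
        CallableStatement cls = super.prepareCall("{call fgac_ctx_pkg.fgac_clear_context(?)}");
        cls.setString(1, "user_id");
        cls.executeUpdate();
        cls.close();

        // restore the JDBC default (auto commit on) before the logical close
        super.setAutoCommit(true);
        super.close();
    }
}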

Friday, November 30, 2012

ORA-01858: a non-numeric character was found where a numeric was expected

An application related SQL query with the following line:
AND TO_DATE(DEP_DATE, 'DD-MM-YYYY') = TO_DATE(SYSDATE+360 , 'DD-MM-YYYY')
was failing on some databases while succeeding on others. Though the logic seems odd (why TO_DATE a DATE type column?), the issue was why it failed with ORA-01858 on some databases while succeeding on others when executed using SQL*Plus. Even on a database where the query was failing, it was possible to execute the query using JDBC (the reason it was working in the application without the error).
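One way to see why the JDBC path behaves differently is to check the NLS settings of the JDBC session itself; the thin driver derives them from the Java locale rather than from shell environment variables such as NLS_DATE_FORMAT. Below is a minimal sketch (the connection details are hypothetical).
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class NLSCheck {

    public static void main(String[] args) throws Exception {

        // hypothetical connection details, for illustration only
        Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@192.168.0.99:1521:orcl", "scott", "tiger");

        Statement st = con.createStatement();
        // shows the NLS values in effect for this session
        ResultSet rs = st.executeQuery("select parameter, value from nls_session_parameters"
                + " where parameter in ('NLS_LANGUAGE','NLS_TERRITORY','NLS_DATE_FORMAT')");
        while (rs.next()) {
            System.out.println(rs.getString(1) + " = " + rs.getString(2));
        }
        rs.close();
        st.close();
        con.close();
    }
}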
After comparing and contrasting the databases where the query succeeded and failed, it came down to the NLS_DATE_FORMAT parameter set in the bash shell environment on some of the database servers. The underlying cause is an implicit conversion: applying TO_DATE to a DATE makes Oracle first convert the DATE to a string using NLS_DATE_FORMAT and then parse that string with the explicit format mask, so when the two formats disagree errors such as ORA-01858 can result.
echo $NLS_LANG
English_United Kingdom.AL32UTF8
$ export NLS_DATE_FORMAT='DD-Mon-YYYY HH24:MI:SS'
$ sqlplus / as sysdba
SQL> select to_date(sysdate,'YYYY-MM-DD') from dual;
select to_date(sysdate,'YYYY-MM-DD') from dual
*
ERROR at line 1:
ORA-01858: a non-numeric character was found where a numeric was expected

$ unset NLS_DATE_FORMAT
$ sqlplus / as sysdba

SQL> select to_date(sysdate,'YYYY-MM-DD') from dual;

TO_DATE(SYSDATE,'Y
------------------
12-NOV-29


It is possible to run TO_DATE with NLS_DATE_FORMAT set, but it must match the format string used:
$ export NLS_DATE_FORMAT='DD-MM-YYYY'
$ sqlplus / as sysdba

--different format to that of NLS_DATE_FORMAT
SQL> select to_date(sysdate,'YYYY-MM-DD') from dual;
select to_date(sysdate,'YYYY-MM-DD') from dual
*
ERROR at line 1:
ORA-01830: date format picture ends before converting entire input string

--same format as the NLS_DATE_FORMAT
SQL> select to_date(sysdate,'DD-MM-YYYY') from dual;

TO_DATE(SY
----------
29-11-2012
The NLS_DATE_FORMAT environment variable comes into play only if NLS_LANG is also set. If NLS_LANG is unset then NLS_DATE_FORMAT is ignored.

Useful metalink notes
RMAN Backup Fails With RMAN-03009 ORA-01858 ORA-01861 [ID 744047.1]
Query using TO_DATE TO_CHAR failing with ORA-01858 or ORA-01843 [ID 790098.1]
The Priority of NLS Parameters Explained (Where To Define NLS Parameters) [ID 241047.1]
OERR: ORA 1858 "a non-numeric character was found where a numeric was expected" [ID 19182.1]
How To Set a NLS Session Parameter At Database Or Schema Level For All Connections? [ID 251044.1]

Monday, November 26, 2012

ORA-15477 cannot communicate with the volume driver

ORA-15477: cannot communicate with the volume driver could be seen when setting the compatible.advm value on an ASM disk group. The system was running 11.2.0.3 (11.2.0.3.4) RAC on RHEL 6 (2.6.32-220.el6.x86_64) and it was a brand new installation of RAC (1488004.1).
SQL> alter diskgroup flash set attribute 'compatible.advm'='11.2';
alter diskgroup flash set attribute 'compatible.advm'='11.2'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15242: could not set attribute compatible.advm
ORA-15238: 11.2 is not a valid value for attribute compatible.advm
ORA-15477: cannot communicate with the volume driver
It seems the ACFS/ADVM modules were not loaded at system reboot, as
[root@rhel6m1 ~]# lsmod | grep oracle
didn't return anything.
Reinstall the ACFS/ADVM modules manually (1371067.1):
[root@rhel6m1 bin]# ./acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
After this the ACFS modules are loaded:
lsmod  | grep oracle
oracleacfs           1857627  2
oracleadvm            238976  6
oracleoks             330757  2 oracleacfs,oracleadvm
and setting compatible.advm proceeds without error, and ACFS volumes can be created:
SQL> alter diskgroup flash set attribute 'compatible.advm'='11.2';
Diskgroup altered.

SQL> alter diskgroup flash add volume volume1 size 1g;
Diskgroup altered.

SQL> select volume_name,volume_device from v$asm_volume;

VOLUME_NAME                    VOLUME_DEVICE
------------------------------ --------------------
VOLUME1                        /dev/asm/volume1-149

# mkfs -t acfs /dev/asm/volume1-149
mkfs.acfs: version                   = 11.2.0.3.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/volume1-149
mkfs.acfs: volume size               = 1073741824
mkfs.acfs: Format complete.

acfsutil registry -a /dev/asm/volume1-149 /opt/acfsvol

#mount -t acfs  /dev/asm/volume1-149 /opt/acfsvol



However, after a reboot of the servers the ACFS modules did not get auto loaded, and the following could be seen in the ASM alert log:
SUCCESS: ALTER DISKGROUP ALL MOUNT /* asm agent call crs *//* {0:0:2} */
SQL> ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent *//* {0:0:2} */
kfvxVolOnOff: Cannot open device file
ORA-15032: not all alterations performed
ORA-15477: cannot communicate with the volume driver
ERROR: ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent *//* {0:0:2} */
To fix this, run acfsroot enable. This command requires the ACFS drivers to be loaded (via acfsload start) before running it:
[root@rhel6m1 grid]# $GI_HOME/bin/acfsroot enable
ACFS-9167: ADVM/ACFS is not installed or loaded. Run 'acfsroot install'.

[root@rhel6m1 grid]# $GI_HOME/bin/acfsload start -s
[root@rhel6m1 grid]# $GI_HOME/bin/acfsroot enable
ACFS-9376: Adding ADVM/ACFS drivers resource succeeded.
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rhel6m1'
CRS-2676: Start of 'ora.drivers.acfs' on 'rhel6m1' succeeded
ACFS-9380: Starting ADVM/ACFS drivers resource succeeded.
ACFS-9368: Adding ACFS registry resource succeeded.
CRS-2672: Attempting to start 'ora.registry.acfs' on 'rhel6m1'
CRS-2676: Start of 'ora.registry.acfs' on 'rhel6m1' succeeded
ACFS-9372: Starting ACFS registry resource succeeded.
This issue is mentioned in 1488004.1 and 1486208.1.
Afterwards the ACFS modules were auto loading on reboot and the ACFS volumes were getting auto mounted without a problem.

Useful metalink notes
ACFS Supported On OS Platforms. [ID 1369107.1]
ACFS Filesystem Not Mounted After Host Reboot [ID 1473875.1]
How To Install/Reinstall Or Deinstall ACFS Modules/Installation Manually? [ID 1371067.1]
11.2.0.3.3 Grid Infrastructure Patchset: 13919095 Installation On RH 6.3 Fails With ACFS-9459 . [ID 1493491.1]
ORA.REGISTRY.ACFS Resource Appears As OFFLINE On A Brand New 11.2 RAC Installation/Configuration. [ID 1488004.1]
ACFS/ADVM is NOT started automatically after node reboot or after CRS is restarted in non-RAC environment [ID 886407.1]
“ora.drivers.acfs” Resource Was Not Configured Therefore RAC ACFS Filesystem Is Not Mounting During The Reboot. [ID 1486208.1]

Related Post
Creating and Removing ACFS

Friday, November 16, 2012

Applying Patch Set using Standby-First

According to metalink note 278641.1, before applying a patch to a physical standby environment, log transport must be stopped on the primary site. Although the note says to shutdown all standby instances, it should still be possible to do a rolling patch on the standby environment if the patch allows it. However, recovery may need to be stopped before patching, and since redo transport is stopped there won't be anything to recover anyway.
From 11.2.0.1 onwards, if a patch set says it's applicable in a standby-first manner, then the patch can be applied on the standby without stopping either redo transport or recovery on the standby database.
This post shows the highlights of applying GI PSU 11.2.0.3.4 (which is a rolling and standby-first applicable patch) to a physical data guard environment.
Before the patch apply, both standby and primary are at the same patch level.
Primary
rac5 psu]$ /opt/app/oracle/product/11.2.0/db_2/OPatch/opatch lsinventory -local | grep "Database Patch Set Update"
Patch description:  "Database Patch Set Update : 11.2.0.3.3 (13923374)"
Sub-patch  13696216; "Database Patch Set Update : 11.2.0.3.2 (13696216)"
Sub-patch  13343438; "Database Patch Set Update : 11.2.0.3.1 (13343438)" 
Standby
rac5b psu]$ /opt/app/oracle/product/11.2.0/db_2/OPatch/opatch lsinventory -local | grep "Database Patch Set Update"
Patch description:  "Database Patch Set Update : 11.2.0.3.3 (13923374)"
Sub-patch  13696216; "Database Patch Set Update : 11.2.0.3.2 (13696216)"
Sub-patch  13343438; "Database Patch Set Update : 11.2.0.3.1 (13343438)"
Identify the apply instance and start patching on the non-apply instance first:
show database rac11g2s

Database - rac11g2s

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       0 seconds
  Real Time Query: OFF
  Instance(s):
    rac11g2s1 (apply instance)
    rac11g2s2

Database Status:
SUCCESS
In this case the apply instance was rac11g2s1 running on rac4b, hence patching begins on rac5b.
Once patched, rac5b will be at a higher patch level than rac4b and all the nodes on the primary:
rac5b psu]$ /opt/app/oracle/product/11.2.0/db_2/OPatch/opatch lsinventory -local | grep "Database Patch Set Update"
Patch description:  "Database Patch Set Update : 11.2.0.3.4 (14275605)"
Sub-patch  13923374; "Database Patch Set Update : 11.2.0.3.3 (13923374)"
Sub-patch  13696216; "Database Patch Set Update : 11.2.0.3.2 (13696216)"
Sub-patch  13343438; "Database Patch Set Update : 11.2.0.3.1 (13343438)"
Switch the apply instance to the patched node and patch the remaining nodes as well:
DGMGRL> edit database rac11g2s set state ='APPLY-ON' with apply instance = 'rac11g2s2';
Succeeded.
DGMGRL> show database rac11g2s
Database - rac11g2s

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       0 seconds
  Real Time Query: OFF
  Instance(s):
    rac11g2s1
    rac11g2s2 (apply instance)

Database Status:
SUCCESS

rac4b ~]$ /opt/app/oracle/product/11.2.0/db_2/OPatch/opatch lsinventory -local | grep "Database Patch Set Update"
Patch description:  "Database Patch Set Update : 11.2.0.3.4 (14275605)"
Sub-patch  13923374; "Database Patch Set Update : 11.2.0.3.3 (13923374)"
Sub-patch  13696216; "Database Patch Set Update : 11.2.0.3.2 (13696216)"
Sub-patch  13343438; "Database Patch Set Update : 11.2.0.3.1 (13343438)"
At this stage all Oracle binaries in the standby environment are patched.
It is possible to run the data guard configuration like this with primary and standby at different patch levels, but there are limitations on how long this configuration will be supported (less than 1 month); these are explained in 1265700.1. This gives time to evaluate the patch, either by creating a snapshot standby or with a role transition.


Once the patch has been evaluated on the standby environment, apply it on the primary site as well. Since this patch can be applied in a rolling fashion, log transport will be active at all times during the patch application.
DGMGRL> show database rac11g2

Database - rac11g2

  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    rac11g21
    rac11g22

Database Status:
SUCCESS
After patching rac5
rac5 ~]$ /opt/app/oracle/product/11.2.0/db_2/OPatch/opatch lsinventory -local | grep "Database Patch Set Update"
Patch description:  "Database Patch Set Update : 11.2.0.3.4 (14275605)"
Sub-patch  13923374; "Database Patch Set Update : 11.2.0.3.3 (13923374)"
Sub-patch  13696216; "Database Patch Set Update : 11.2.0.3.2 (13696216)"
Sub-patch  13343438; "Database Patch Set Update : 11.2.0.3.1 (13343438)"
Shutdown the remaining instances and apply the patch (rac4 in this case), and at the same time execute the post-installation tasks on the already patched instance.
After patching rac4
rac4 psu]$ /opt/app/oracle/product/11.2.0/db_2/OPatch/opatch lsinventory -local | grep "Database Patch Set Update"
Patch description:  "Database Patch Set Update : 11.2.0.3.4 (14275605)"
Sub-patch  13923374; "Database Patch Set Update : 11.2.0.3.3 (13923374)"
Sub-patch  13696216; "Database Patch Set Update : 11.2.0.3.2 (13696216)"
Sub-patch  13343438; "Database Patch Set Update : 11.2.0.3.1 (13343438)"

Useful metalink notes
Oracle Patch Assurance - Data Guard Standby-First Patch Apply [ID 1265700.1]
Mixed Oracle Version support with Data Guard Redo Transport Services [ID 785347.1]
How do you apply a Patchset,PSU or CPU in a Data Guard Physical Standby configuration [ID 278641.1]

Tuesday, November 6, 2012

warning: ignoring old commands for target `pcscfg.cfg'

The following could be seen when applying PSU 11.2.0.3.4 on a DB home:
OPatch found the word "warning" in the stderr of the make command.
Please look at this stderr. You can re-run this make command.
Stderr output:
ins_precomp.mk:19: warning: overriding commands for target `pcscfg.cfg'
/opt/app/oracle/product/11.2.0/ent11.2.0.3/precomp/lib/env_precomp.mk:2160: warning: ignoring old commands for target `pcscfg.cfg'
/opt/app/oracle/product/11.2.0/ent11.2.0.3/precomp/lib/ins_precomp.mk:19: warning: overriding commands for target `pcscfg.cfg'
/opt/app/oracle/product/11.2.0/ent11.2.0.3/precomp/lib/env_precomp.mk:2160: warning: ignoring old commands for target `pcscfg.cfg'
This is only a warning; the patch has been applied successfully.

Useful metalink notes
Opatch warning: overriding commands for target xxxx [ID 1448337.1]

Related Post
ins_srvm.mk: warning: overriding commands for target libsrvm11.so

Monday, October 8, 2012

Guarantee Restore Point Leads to High DB File Sequential Read Waits on Undo Tablespace in 11gR2

Guarantee restore points are useful in test environments as they allow easy rollback of the database, so tests can be run again without needing new data sets for inserts that would otherwise cause primary key violations.
However, it seems having a guarantee restore point can cause excessive db file sequential read waits on the undo tablespace for insert statements. This can alter the performance characteristics of tests.
The following test case could be used to recreate this behavior. The test environment used here is 11gR2 (11.2.0.3.3).
Create the table for the test case:
create table ses_lock (sid varchar2(100) primary key);
Open three SQL*Plus sessions (called 1, 2 and 3 from now on) and run the following PL/SQL blocks, one in each.
On session 1, create and hold a lock to start the concurrent insert process:
declare
    i pls_integer;
begin
    i := dbms_lock.request(10);
end;
/
On session 2, run the PL/SQL for the insert statement:
declare
    i pls_integer;
begin
    i := dbms_lock.request(10,DBMS_LOCK.S_MODE);

    for i in 1 .. 100000
    loop
     insert into ses_lock values (dbms_random.string('x',36));
     commit;
    end loop;
     i := dbms_lock.release(10);
end;
/
Run the same on session 3:
declare
    i pls_integer;
begin
    i := dbms_lock.request(10,DBMS_LOCK.S_MODE);

    for i in 1 .. 100000
    loop
     insert into ses_lock values (dbms_random.string('x',36));
     commit;
    end loop;
     i := dbms_lock.release(10);
end;
/
and finally, on session 1, release the lock:
declare
    j pls_integer;
begin
    j := dbms_lock.release(10);
end;
/
This will start concurrent inserts into the table.
Investigating the active session history during this period for the insert SQL shows that the db file sequential reads are on datafile 14, which is the datafile where the table resides.
select inst_id,p1,p1text,sql_id,count(*)
from gv$active_session_history
where sample_time >  to_date('2012-09-17 11:37','YYYY-MM-DD HH24:MI')
and event ='db file sequential read'
group by inst_id, p1, p1text, sql_id
order by 5 desc,1 ,2;

INST_ID  P1  P1TEXT SQL_ID        COUNT(*)         
-------- --- ------ ------------- ---------
1        14  file#  7pj2bbx25m1rf 30                     

SQL> select sql_id,sql_text from v$sql where sql_id='7pj2bbx25m1rf';

SQL_ID        SQL_TEXT
------------- ------------------------------------------------------------
7pj2bbx25m1rf INSERT INTO SES_LOCK VALUES (DBMS_RANDOM.STRING('x',36))

SQL> select file_id,tablespace_name from dba_data_files where file_id=14;

FILE_ID TABLESPACE_NAME
------- -----------------
14      BX2KTBS
Delete the rows in the table and create a guarantee restore point
SQL> delete from ses_lock;
200000 rows deleted.

SQL> commit;
It is possible to create a guarantee restore point in 11gR2 without having flashback on. Before the guarantee restore point is created:
SQL> select flashback_on from v$database;

FLASHBACK_ON
-------------
NO
After the guarantee restore point is created:
SQL> create restore point first_restore guarantee flashback database;

SQL> select flashback_on from v$database;

FLASHBACK_ON
------------------
RESTORE POINT ONLY
Create the three SQL*Plus sessions mentioned above and run the inserts again.
Querying the active session history afterwards shows that during this test the db file sequential read waits were on the data file used for the undo tablespace, with a much higher number of waits compared to the test without the guarantee restore point.
select inst_id,p1,p1text,sql_id,count(*)
from gv$active_session_history
where sample_time >  to_date('2012-09-17 12:02','YYYY-MM-DD HH24:MI')
and event ='db file sequential read'
group by inst_id, p1, p1text, sql_id
order by 5 desc,1 ,2;

INST_ID  P1  P1TEXT SQL_ID        COUNT(*)         
-------- --- ------ ------------- ---------
1        3   file#  7pj2bbx25m1rf 266    

SQL> select file_id,tablespace_name from dba_data_files where file_id in (3,14);

FILE_ID  TABLESPACE_NAME
-------- ---------------
3        UNDOTBS1
14       BX2KTBS
The restore point was then dropped and the test run again; this time there were no db file sequential read waits. As soon as the restore point was recreated and the test carried out, lots of db file sequential read waits appeared on the undo tablespace data file. Below is the output of querying active session history for the second test with the restore point, confirming that the observed behavior is repeatable with this test case.
select inst_id,p1,p1text,sql_id,count(*)
from gv$active_session_history
where sample_time >  to_date('2012-09-17 12:20','YYYY-MM-DD HH24:MI')
and event ='db file sequential read'
group by inst_id, p1, p1text, sql_id
order by 5 desc,1 ,2;

INST_ID  P1  P1TEXT SQL_ID        COUNT(*)         
-------- --- ------ ------------- ---------
1        3   file#  7pj2bbx25m1rf 193    


SQL tracing could also be used to demonstrate this problem. This is useful if the database is Standard Edition, where use of the active session history view is not an option because of license restrictions.
Enable SQL tracing for the session:
alter session set max_dump_file_size=UNLIMITED;
alter session set  tracefile_identifier='before_grp';
alter session set events '10046 trace name context forever, level 12';
Run the insert statement, this time using only one session:
declare
    i pls_integer;
begin
    i := dbms_lock.request(10,DBMS_LOCK.S_MODE);

    for i in 1 .. 100000
    loop
     insert into ses_lock values (dbms_random.string('x',36));
     commit;
    end loop;
     i := dbms_lock.release(10);
end;
/
The formatted trace file output shows no waits on the db file sequential read wait event:
SQL ID: dqhz25akp92bp Plan Hash: 0

INSERT INTO SES_LOCK VALUES (DBMS_RANDOM.STRING('x',36))

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 100000     14.95      14.77          0       2746     517465      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100001     14.95      14.77          0       2746     517465      100000

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: 84     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  (cr=155 pr=5 pw=0 time=35733 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  latch: In memory undo latch                     1        0.00          0.00
  log file switch completion                      2        0.00          0.01
  latch free                                      1        0.00          0.00
********************************************************************************
Delete all rows from the table and create the restore point:
SQL> delete from ses_lock;
100000 rows deleted.

SQL>commit;
Commit complete.

SQL> create restore point first_restore guarantee flashback database;
Run the test again
alter session set max_dump_file_size=UNLIMITED;
alter session set  tracefile_identifier='after_grp';
alter session set events '10046 trace name context forever, level 12';

declare
    i pls_integer;
begin
    i := dbms_lock.request(10,DBMS_LOCK.S_MODE);

    for i in 1 .. 100000
    loop
     insert into ses_lock values (dbms_random.string('x',36));
     commit;
    end loop;
     i := dbms_lock.release(10);
end;
/
and the formatted trace file output shows a high number of db file sequential read waits compared to the previous test:
SQL ID: dqhz25akp92bp Plan Hash: 0

INSERT INTO SES_LOCK VALUES (DBMS_RANDOM.STRING('x',36))

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 100000     20.71      22.50       3507        107     513257      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100001     20.71      22.50       3507        107     513257      100000

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: 84     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  (cr=1 pr=0 pw=0 time=1844 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  db file sequential read                      3507        0.11          3.78
  log file switch (checkpoint incomplete)         3        0.00          0.00
  latch: redo allocation                          1        0.00          0.00
  log file switch completion                      1        0.06          0.06
********************************************************************************
Inspecting the trace file itself shows the details of the wait events; the waits are on file#=3:
WAIT #46953194410944: nam='db file sequential read' ela= 277 file#=3 block#=16505 blocks=1 obj#=0 tim=1349341486136362
WAIT #46953194410944: nam='db file sequential read' ela= 263 file#=3 block#=17317 blocks=1 obj#=0 tim=1349341486138981
WAIT #46953194410944: nam='db file sequential read' ela= 330 file#=3 block#=16955 blocks=1 obj#=0 tim=1349341486149736
WAIT #46953194410944: nam='db file sequential read' ela= 387 file#=3 block#=17076 blocks=1 obj#=0 tim=1349341486152818
WAIT #46953194410944: nam='db file sequential read' ela= 15 file#=3 block#=16839 blocks=1 obj#=0 tim=1349341486161143
File#=3 is the datafile used for the undo tablespace.
Drop the restore point and the db file sequential read waits disappear; the output below is from after the restore point was dropped:
SQL ID: dqhz25akp92bp Plan Hash: 0

INSERT INTO SES_LOCK VALUES (DBMS_RANDOM.STRING('x',36))

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 100000     15.44      14.60          0         67     593197      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100001     15.44      14.60          0         67     593197      100000

Misses in library cache during parse: 0
Optimizer mode: FIRST_ROWS
Parsing user id: 84     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  (cr=1 pr=0 pw=0 time=1613 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  log file switch (checkpoint incomplete)         4        0.00          0.01
  latch: cache buffers chains                     1        0.00          0.00
  Disk file operations I/O                        1        0.00          0.00
********************************************************************************
An SR was raised with Oracle and the response was that "this is similar to Bug 12921374 : GURANTEED RESTORE POINT CAUSES MANY PHYSICAL READS". However, there has not been any movement on this bug since Feb 2012 and there's no solution or workaround. As a result of this SR a new bug report has been created, which is available under "Bug 14671654 : GUARANTEE RESTORE POINT PROVOKES HIGH".
The same test was carried out on 11.1.0.7 (11.1.0.7.12) and it didn't show the same behavior as 11gR2, so the issue appeared localized to 11gR2. (Update 2014/11/21: 11.1 also exhibits the same behavior. Depending on when the table is created and when the restore point is created, the location of the wait events in the trace file differs. Looking only at the insert statement may result in missing the high waits shown under the commit. Either way, the total number of waits for the test case on 11.1 is no different to that of 11.2.)

Update 15 July 2013
The issue is also present on 12c (12.1.0.1). Below is the tkprof output from the same test done on 12c. Before restore point creation:
SQL ID: dqhz25akp92bp Plan Hash: 0

INSERT INTO SES_LOCK
VALUES
 (DBMS_RANDOM.STRING('x',36))


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 100000     41.03      40.49          1       3272     515757      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100001     41.03      40.50          1       3272     515757      100000

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 103     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  (cr=783 pr=7 pw=0 time=195740 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        6        0.00          0.00
  db file sequential read                         3        0.00          0.00
  latch: shared pool                              8        0.00          0.00
  log file switch (private strand flush incomplete)
                                                  1        0.01          0.01
  control file sequential read                   20        0.00          0.00
  Data file init write                           12        0.00          0.00
  db file single write                            1        0.00          0.00
  control file parallel write                     3        0.00          0.00
  rdbms ipc reply                                 1        0.00          0.00
********************************************************************************
After the restore point is created:
SQL ID: dqhz25akp92bp Plan Hash: 0

INSERT INTO SES_LOCK
VALUES
 (DBMS_RANDOM.STRING('x',36))


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 100000     43.42      42.15       3484        660     512898      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100001     43.43      42.16       3484        660     512898      100000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 103     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  (cr=226 pr=0 pw=0 time=56775 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  db file sequential read                      3484        0.28          0.47
  log file switch (checkpoint incomplete)         1        0.01          0.01
  log file switch completion                      1        0.01          0.01
********************************************************************************
All the waits were on the undo tablespace, same as before.

Update 14 July 2014
The same behavior is also observed on 11.2.0.4. Before the guarantee restore point:
SQL ID: dqhz25akp92bp Plan Hash: 0

INSERT INTO SES_LOCK
VALUES
 (DBMS_RANDOM.STRING('x',36))


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute 100000     72.74      81.65          6       3595     426842      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100001     72.75      81.66          6       3595     426842      100000

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 50     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  (cr=164 pr=8 pw=0 time=103136 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  enq: IV -  contention                          10        0.00          0.00
  db file sequential read                         6        0.00          0.02
  Disk file operations I/O                        1        0.00          0.00
  row cache lock                                  7        0.00          0.00
  latch: shared pool                              4        0.00          0.00
  latch: enqueue hash chains                      3        0.00          0.00
  latch free                                      3        0.00          0.00
  latch: redo allocation                          1        0.00          0.00
  log file switch (checkpoint incomplete)         1        0.00          0.00
  log file switch completion                      1        0.00          0.00
********************************************************************************
After the guarantee restore point is created:
SQL ID: dqhz25akp92bp Plan Hash: 0

INSERT INTO SES_LOCK
VALUES
 (DBMS_RANDOM.STRING('x',36))


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 100000     82.15      94.40       5143       3615     434964      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100001     82.16      94.41       5143       3615     434964      100000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 50     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  (cr=5 pr=3 pw=0 time=13633 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  row cache lock                                  3        0.00          0.00
  Disk file operations I/O                        3        0.00          0.00
  db file sequential read                      5143        0.52          2.31
  KSV master wait                                 2        0.00          0.00
  ASM file metadata operation                     1        0.00          0.00
  latch free                                      1        0.00          0.00
  log file switch (checkpoint incomplete)         2        0.01          0.02
********************************************************************************
Update 21 November 2014
The final reply from Oracle: "this is not a bug but expected behavior when restore point is in use".

Thursday, October 4, 2012

Use of WM_CONCAT Can Exhaust Temporary Tablespace in 11gR2 & 10gR2

The WM_CONCAT function can be used to aggregate multiple row values into a single row. Below is a short example:
create table x (a number, b varchar2(10));

SQL> insert into x values(1,'a');
1 row created.
SQL> insert into x values (2,'b');
1 row created.
SQL> insert into x values(1,'c');
1 row created.
SQL> insert into x values(2,'d');
1 row created.
SQL> commit;

SQL> select a,wm_concat(b) as concats from x group by a;

         A CONCATS
---------- ----------
         1 a,c
         2 b,d
However, wm_concat is an undocumented function not intended for direct use by customers (more on this later on), but that doesn't prevent developers from using it.
There's a danger that extensive invocation of this function can exhaust the temporary tablespace. The primary reason is that wm_concat contains a sort aggregation that results in temporary tablespace segments being used:
SQL> explain plan for select wm_concat('abc'||'def') from dual;

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------
Plan hash value: 3910148636

-----------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT |      |     1 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |      |     1 |            |          |
|   2 |   FAST DUAL      |      |     1 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------
(no matter what: the tested DB instance had a 7G SGA and 3G PGA and this was the only test case running, so there was plenty of memory to do the sort in memory), but these segments are not released until the connection is physically closed, which means exiting SQL*Plus, or closing the connection pool in the case of JDBC connection pools. This effect can be demonstrated with the following test case on an 11gR2 (11.2.0.3.3) Linux 64-bit instance.
Create a temporary tablespace of 5M and assign it as the default temporary tablespace for the test user:
create temporary tablespace temp5m TEMPFILE size 5m autoextend on next 1m maxsize 5m;
alter user asanga temporary tablespace temp5m;
Run the SQL with wm_concat from 5 different SQL*Plus sessions and monitor the temporary tablespace segment usage.
Temporary tablespace usage before the test starts:
SELECT A.inst_id,
  A.tablespace_name TABLESPACE,
  D.mb_total,
  SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_used,
  D.mb_total         - SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_free
FROM gv$sort_segment A,
  (SELECT B.INST_ID,
    B.name,
    C.block_size,
    SUM (C.bytes) / 1024 / 1024 mb_total
  FROM gv$tablespace B,
    gv$tempfile C
  WHERE B.ts#  = C.ts#
  AND c.inst_id=b.inst_id
  GROUP BY B.INST_ID,
    B.name,
    C.block_size
  ) D
WHERE A.tablespace_name = D.name
and a.tablespace_name='TEMP5M'
AND A.inst_id           =D.inst_id
GROUP BY a.inst_id,
  A.tablespace_name,
  D.mb_total
ORDER BY 1,2;

 INST_ID TABLESPACE   MB_TOTAL    MB_USED    MB_FREE
-------- ---------- ---------- ---------- ----------
       1 TEMP5M              5          0          5
Run the SQL in the first SQL*Plus session:
SQL> select wm_concat('abc'||'def') as X from dual;

X
--------
abcdef
Temporary segment usage
 INST_ID TABLESPACE   MB_TOTAL    MB_USED    MB_FREE
-------- ---------- ---------- ---------- ----------
       1 TEMP5M              5          1          4
The above SQL could be run in the same SQL*Plus session multiple times without increasing the temporary segment usage. As soon as the SQL is run from a different SQL*Plus session the temporary segment usage increases. After running in a second SQL*Plus session:
 INST_ID TABLESPACE   MB_TOTAL    MB_USED    MB_FREE
-------- ---------- ---------- ---------- ----------
       1 TEMP5M              5          2          3
Similarly, four SQL*Plus sessions could be used to run the above SQL, and at the end of running it in the 4th session the temporary segment usage will be:
 INST_ID TABLESPACE   MB_TOTAL    MB_USED    MB_FREE
-------- ---------- ---------- ---------- ----------
       1 TEMP5M              5          4          1
Running it in a 5th SQL*Plus session will result in the following error:
SQL> select wm_concat('abc'||'def') as X from dual;
ERROR:
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP5M
ORA-06512: at "WMSYS.WM_CONCAT_IMPL", line 31
Use the following SQL to see the SQL statements using temporary segments (SQL from 317441.1):
SQL> SELECT a.username,
  2    a.sid,
  3    a.serial#,
  4    a.osuser,
  5    b.tablespace,
  6    b.blocks,
  7    b.segtype,
  8    c.sql_text
  9  FROM v$session a,
 10    v$tempseg_usage b,
 11    v$sqlarea c
 12  WHERE a.saddr    = b.session_addr
 13  AND c.address    = a.sql_address
 14  AND c.hash_value = a.sql_hash_value
 15  ORDER BY b.tablespace,
 16    b.blocks;

USERNA        SID    SERIAL# OSUSER TABLESPACE     BLOCKS SEGTYPE   SQL_TEXT
------ ---------- ---------- ------ ---------- ---------- --------- --------------------------------------------------
ASANGA          9       6487 oracle TEMP5M            128 LOB_DATA  select wm_concat('abc'||'def') as X from dual
ASANGA        202        209 oracle TEMP5M            128 LOB_DATA   select wm_concat('abc'||'def') as X from dual
ASANGA        136       3207 oracle TEMP5M            128 LOB_DATA  select wm_concat('abc'||'def') as X from dual
ASANGA         78        377 oracle TEMP5M            128 LOB_DATA  select wm_concat('abc'||'def') as X from dual
The temporary segments will not be released until the SQL*Plus session is closed (by exit). Even though the table has no LOB column, the temp segments are shown as LOB_DATA; LOBs are used within wm_concat, which is why the temp segments appear as LOB_DATA.


In Java applications that use a connection pool (UCP or a pooled DataSource), connections are not physically closed when the connection close method is called. The close is a logical close and the physical connection remains open in the connection pool, ready to be used by another session. As such, the temporary segment usage will keep on increasing until the temporary tablespace is exhausted, since the connection pool is never closed except when the application (or the application server in most cases) is restarted, dies or crashes.
The following Java code could be used to demonstrate how this affects a Java application using a connection pool.
import java.sql.Connection;
import java.sql.SQLException;

import oracle.ucp.jdbc.PoolDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;

public class DBConnection {

    private PoolDataSource pds = null;

    public DBConnection(String username,String password, String host,String port, String sid) {
        try {
            pds = PoolDataSourceFactory.getPoolDataSource();
            pds.setUser(username);
            pds.setPassword(password);
            pds.setURL("jdbc:oracle:thin:@"+host+":"+port+":"+sid);
            pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
            pds.setInitialPoolSize(10);
            pds.setMinPoolSize(10);
            pds.setMaxPoolSize(15);
            
        } catch (SQLException ex) {
            ex.printStackTrace();
        }
    }

    public Connection getConnection() throws SQLException {

         return pds.getConnection();
    }
}


import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class TempSegThread extends Thread {

    private DBConnection pool;

    public TempSegThread(DBConnection pool) {
        this.pool = pool;
    }

    @Override
    public void run() {
        try {
            // each thread executes the query once on its own pooled connection
            for (int i = 0; i < 1; i++) {

                int j = 0;
                Connection con = pool.getConnection();
                PreparedStatement pr = con.prepareStatement("select wm_concat('abc'||'def') from dual");
                ResultSet rs = pr.executeQuery();
                while (rs.next()) {
                    System.out.println(getName() +" "+ rs.getString(1));
                }
                rs.close();
                pr.close();
                con.close();
            }
        } catch (SQLException ex) {
            Logger.getLogger(TempSegThread.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}


public class ThreadExec {

    public static void main(String[] args) {

//       if (args.length != 5) {
//
//            System.out.println("usage: java ThreadExec username password host port sid");
//        } else {

            try {

                DBConnection pool = new DBConnection("asanga","asa","192.168.0.66","1521","ent11g2");
                //Change to TempSegThread[4] to run without unable to extend temp segment
                TempSegThread [] tempThread = new TempSegThread[5];

                for(int i = 0 ; i < tempThread.length; i++){
                    tempThread[i] = new TempSegThread(pool);
                    tempThread[i].start();
                }

                for(TempSegThread t : tempThread){
                    t.join();
                }
                Thread.sleep(5 * 60 * 1000); // sleep 5 minutes so temp segment usage can be checked while the pool stays open

            } catch (Exception ex) {
                ex.printStackTrace();
            }
//        }
    }
}
The program runs the same sql used in the sql*plus test in 5 threads using 5 different connections. One of the threads will fail with unable to extend temp segment while the others finish without error. Changing the thread count to 4 will allow all 4 threads to run without error, and at the end of their execution the program will sleep for 5 minutes, giving enough time to run the temporary segment usage SQL and see that the temporary segments are still being held and usage hasn't gone down even after the connections are closed.
As mentioned in metalink note 1384829.1, using "60025 trace name context forever" is of no help in this case even though the temporary segments are classified as lob_data.
Oracle has mentioned in many documents that wm_concat "is not meant to be used by customers directly, and could be changed/updated without notice by Oracle Development. Do not use the WMSYS.WM_CONCAT view in your application" (from 1300595.1). So the best option is not to use it at all, since any issue involving it is unlikely to qualify for support. (When an SR was raised this is exactly what Oracle said: wm_concat is not for end users.)
Of the other versions tested, 11gR1 (11.1.0.7.12) didn't have this issue, but 10gR2 (10.2.0.5.8) showed the same behaviour as 11gR2.

Useful metalink notes
WMSYS.WM_CONCAT Should Not Be Used For Customer Applications, It Is An Internal Function [ID 1336219.1]
Problem with WMSYS.WM_CONCAT Function after Upgrading [ID 1300595.1]
SQL USING WM_CONCAT RUNS SLOWER IN 10.2.0.5 and 11.2.0.2 [ID 1393596.1]
How to Release Temporary LOB Segments without Closing the JDBC Connection [ID 1384829.1]
How to Release the Temp LOB Space and Avoid Hitting ORA-1652 [ID 802897.1]

Related Post
WM_CONCAT Removed From 12c Workspace Manager?

Monday, October 1, 2012

Log File Sync and Log File Parallel Write Waits

On a routine database health check it was discovered that the log file sync wait event had increased 10-20 times over the baseline values. Graphs were generated with ADMon.
Log File Sync Waits
Metalink note 34592.1 lists the usual causes for high log file sync waits. Assuming all else (meaning hardware related parameters) remains constant, the main cause is a high user commit rate. But in this case the user commit rates were consistent throughout the problem period.
User Commits
The same was true for user rollbacks (the drop is due to an instance restart) and transaction rollbacks.
User Rollbacks
Transaction Rollback
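These counters can also be sampled directly rather than graphed; a minimal sketch against gv$sysstat (sample twice and take the difference to get a rate):
SELECT inst_id, name, value
FROM gv$sysstat
WHERE name IN ('user commits','user rollbacks','transaction rollbacks')
ORDER BY inst_id, name;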
DB CPU was also consistent throughout the problem period, which indicated there was no increase in system load.
DB CPU
Metalink note 34592.1 shows how the log file sync wait time can be broken down. Two metrics are relevant to this case: the "redo write time" statistic, which is the time taken by the lgwr process gathering the redo to be written and issuing the IO, and the log file parallel write wait (refer 34583.1), whose wait time is the "time taken for all the outstanding I/O requests to complete". The log file parallel write wait time graph was similar to the log file sync wait time graph.
Log File Parallel Write
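The cumulative figures behind these wait graphs can be cross checked against v$system_event; a sketch:
SELECT event,
  total_waits,
  time_waited_micro/1000 AS time_waited_ms,
  ROUND(time_waited_micro/total_waits/1000,2) AS avg_wait_ms
FROM v$system_event
WHERE event IN ('log file sync','log file parallel write');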
34592.1 also mentions several statistics related to lgwr performance: redo writes, redo blocks written, redo write time, redo wastage and redo size. Definitions of these statistics are from the Oracle Reference guide.
Redo writes is the total number of writes by LGWR to the redo log files; it was also consistent throughout the problem period.
Redo Writes
Redo blocks written is the total number of redo blocks written; it too is consistent and holds the same pattern as before the problem period.
Redo Blocks Written
Redo write time is the total elapsed time of the write from the redo log buffer to the current redo log file, in 10s of milliseconds. There is an increase in redo write time corresponding to the log file sync and log file parallel write graphs, indicating more time spent writing redo to the log file.
Redo Write Time
Redo wastage is the number of bytes wasted because redo blocks needed to be written before they were completely full. This statistic was also consistent throughout the problem period.
Redo Wastage
The last is redo size, the total amount of redo generated in bytes. This statistic is also an indication of system load; if there is more DML, the amount of redo generated would also increase. However in this case the redo size was also consistent throughout the problem period.
Redo Size
Next, redo log space and redo log buffer statistics were looked at. Redo buffer allocation retries is the total number of retries necessary to allocate space in the redo buffer. Though there hadn't been any reduction in redo buffer size, the number of redo buffer allocation retries had gone up, indicating lgwr was not clearing the log buffer fast enough.
Redo Buffer Allocation Retries
Redo Buffers
The increase in redo log space requests and redo log space wait time indicates that the wait is not memory bound but IO (disk) bound. Redo log space requests is the number of times the active log file is full and Oracle must wait for disk space to be allocated for the redo log entries; an increase here means slow IO. Redo log space wait time is the total elapsed waiting time for "redo log space requests" in 10s of milliseconds. Both of these statistics increased in value over the problem period.
Redo Log Space Requests
Redo Log Space Wait Time
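All of the statistics discussed above come from v$sysstat and can be pulled in one go; a sketch:
SELECT name, value
FROM v$sysstat
WHERE name IN ('redo writes','redo blocks written','redo write time',
  'redo wastage','redo size','redo buffer allocation retries',
  'redo log space requests','redo log space wait time');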
Considering all of the above, it seems the traffic was the same (system load, redo generation) but some of the lanes (IO bandwidth) "might be" closed. The IO time in the OS statistics showed an increase that corresponds with the log file sync increase. So the assumption that "all else remains the same" may not be correct.
IO Time
Therefore at this point the assumption was that the cause of the increase in these wait events might be "something" in the IO subsystem rather than the database or the application. The system was a two node cluster using a SAN with RAID 5. Both 34583.1 and 34592.1 advise against placing redo logs on RAID 5, but in this case even with RAID 5 the log file *(sync/parallel write) waits had been low, so RAID 5 is not the root cause here.
On further investigation of the SAN it was discovered that it had two RAID controllers and the battery on one of them had failed (thus closing one lane of traffic).
Replacing the dead battery and bringing the RAID controller back online (opening up the lane) made the waits go back to their previous level.
Log File Parallel Write



Below is a different system's log file *(sync/parallel write) waits; in this case the increase in waits was caused by a burst of user commits.
Log File Sync
Log File Parallel Write
User Commits
An increase in these two wait events can happen due to user activity as well as due to a change in IO related hardware (a partial failure, as in this case). Having a baseline, taken when performance is optimal, to compare with the problem period helps in diagnosing and identifying the reason for the increase.
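Where AWR is in use (requires the diagnostics pack), a known-good period can be preserved as a baseline so it survives snapshot purging; a minimal sketch with hypothetical snapshot ids:
BEGIN
  dbms_workload_repository.create_baseline(
    start_snap_id => 1234,  -- hypothetical snapshot at the start of the optimal period
    end_snap_id   => 1240,  -- hypothetical snapshot at the end of the optimal period
    baseline_name => 'optimal_baseline');
END;
/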

Useful metalink notes
WAITEVENT: "log file sync" Reference Note [ID 34592.1]
WAITEVENT: "log file parallel write" Reference Note [ID 34583.1]

Thursday, September 20, 2012

Calling Web Service with wget

Wget - "The non-interactive network downloader" (from the man page) - can be used to call a web service if a request XML is available. This is useful for quickly testing a web service.
1. The first step is to capture the SOAP request in XML format; Wireshark could be used for this. Below is a sample request used to test the failover setup for the application server connection when RAC is used. The request sends the client's machine name as the input and responds with the instance connected at the back end.
<?xml version='1.0' encoding='UTF-8'?>
  <S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
    <S:Body>
      <ns2:getAppAndDBServerNames xmlns:ns2="http://asanga/">
        <arg0>pc120</arg0>
      </ns2:getAppAndDBServerNames>
    </S:Body>
</S:Envelope>
2. Save the XML request in a file. This file will be used as the input to wget.
3. Execute the wget command as shown below
wget http://192.168.0.66:8080/FailOverTest/FailOverTestPort --post-file=input.xml 
--header="Content-Type: text/xml" --output-document=soapResponse.xml
input.xml is the file with the XML request created in step 2 above. The response from the web service will be saved into soapResponse.xml. 192.168.0.66:8080 is the IP and port of the server where the web service is deployed and /FailOverTest/FailOverTestPort is the service called.
$ wget http://192.168.0.66:8080/FailOverTest/FailOverTestPort --post-file=input.xml --header="Content-Type: text/xml" --output-document=soapResponse.xml
--2012-09-20 11:47:08--  http://192.168.0.66:8080/FailOverTest/FailOverTestPort
Connecting to 192.168.0.66:8080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 331 [text/xml]
Saving to: `soapResponse.xml'

100%[============================================================================================================>] 331         --.-K/s   in 0s

2012-09-20 11:47:08 (35.1 MB/s) - `soapResponse.xml' saved [331/331]
4. The soapResponse.xml will have the output in XML format.
<?xml version="1.0" ?>
  <S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
    <S:Body>
      <ns2:getAppAndDBServerNamesResponse xmlns:ns2="http://asanga/">
        <return>
          Connected to ent11g2 DB Server hpc1.domain.net Application Server on Thu Sep 20 11:34:02 BST 2012
        </return>
      </ns2:getAppAndDBServerNamesResponse>
    </S:Body>
  </S:Envelope>

Wednesday, September 19, 2012

enq: TX row lock contention During Inserts due to Bitmap Indexes

Bitmap indexes are not suited to tables that have frequent DML operations. An earlier post showed how bitmap index locking happens.
enq: TX row lock contention waits happen when multiple sessions try to modify the same row, but they can manifest in other areas as well, such as inserts on a primary key column and inserts on bitmap index columns. Since an insert effectively creates a new row, seeing enq: TX row lock contention on inserts can be a bit puzzling.
In the case of inserting into a primary key column, subsequent sessions will wait on the enq: TX row lock contention event until the first session commits or rolls back. This is required to identify and raise ORA-00001: unique constraint violations in cases where duplicate values are being inserted into the primary key column. Therefore it is not uncommon to see enq: TX row lock contention waits on inserts when there's a primary key column and a highly concurrent insert rate. To reduce this wait, remove the primary key if possible, though that may not always be possible due to application logic.
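The primary key case is easy to reproduce; a minimal sketch using a hypothetical table t:
-- session 1
create table t (id number primary key);
insert into t values (1);   -- no commit yet

-- session 2: waits on enq: TX row lock contention until session 1
-- commits (this insert then fails with ORA-00001) or rolls back (it then succeeds)
insert into t values (1);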
The second place enq: TX row lock contention comes up is when inserting into a table with bitmap indexes. The waits happen because bitmap index entries must be updated based on the inserted rows. This has a far worse performance impact than an insert on a primary key. Below is a simple test to illustrate the point.
Create a table with primary key
create table abc (id number, a varchar2(1), b number(1), c varchar2(1), primary key (id));
Create bitmap indexes as below
SQL> create bitmap index aidxa on abc(a);
Index created.

SQL> create bitmap index aidxb on abc(b);
Index created.

SQL> create bitmap index aidxc on abc(c);
Index created.
Open two sql*plus sessions and run the following two PL/SQL blocks. On session 1
begin
    for i in 1 .. 100000
    loop
      insert into abc values(i,dbms_random.string('i',1),round(dbms_random.value(1,5)),dbms_random.string('i',1));
      commit;
    end loop;
end;
/
On session 2
begin
    for i in 100000 .. 200000
    loop
      insert into abc values(i,dbms_random.string('i',1),round(dbms_random.value(1,5)),dbms_random.string('i',1));
      commit;
    end loop;
end;
/
Monitor the waits either through the EM console or the active session history view. In this case APConsole was used and the output is given at the end.
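If the active session history view is used (requires the diagnostics pack), a sketch like the following gives the wait breakdown for the test period:
SELECT event, COUNT(*) AS samples
FROM v$active_session_history
WHERE sample_time > systimestamp - INTERVAL '10' MINUTE
AND event IS NOT NULL
GROUP BY event
ORDER BY samples DESC;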


Drop the bitmap indexes, truncate the table and run the test again. The image below shows the comparison of the two test cases.

It is apparent that the amount of cpu time is high during the test with the bitmap indexes in place, but the key point is the enq: TX row lock contention, which is entirely absent when the test is run without bitmap indexes. Therefore when high enq: TX row lock contention waits are seen for insert statements, it is worth investigating whether the table in question has bitmap indexes.
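A quick data dictionary check, sketched here for the test table used in this post:
SELECT index_name, index_type
FROM user_indexes
WHERE table_name = 'ABC'
AND index_type = 'BITMAP';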

Related post
Bitmap Index Locking

Metalink Notes
TX Transaction and Enq: Tx - Row Lock Contention - Example wait scenarios [ID 62354.1]

Monday, September 17, 2012

PCTFREE and Rows per Block

For tablespaces with ASSM, only the pctfree parameter has any effect on data storage. From the Oracle database concepts document: "The PCTFREE storage parameter is essential to how the database manages free space. This SQL parameter sets the minimum percentage of a data block reserved as free space for updates to existing rows. Thus, PCTFREE is important for preventing row migration and avoiding wasted space."
The number of rows that can be stored in a block is affected by the value set for pctfree. If pctfree is high the row count in a block goes down (useful for reducing hot blocks), and when pctfree is low the row count in a block goes up (useful for full table scans). The size of the block and the length of the rows also play a part in the row count inside a block.
The following is a simple test to illustrate this point. Two tables are created, identical in all aspects except the pctfree value. Both reside in a tablespace with a block size of 2K.
create table pct0 (id varchar2(100)) tablespace tbs2k pctfree 0;
create table pct99 (id varchar2(100)) tablespace tbs2k pctfree 99;

SQL> select table_name,pct_free from user_tables where table_name in ('PCT0','PCT99');

TABLE_NAME                       PCT_FREE
------------------------------ ----------
PCT99                                  99
PCT0                                    0
The following PL/SQL block will insert 100 rows into each table
declare
x varchar2(40);
begin
  for i in 1 .. 100
    loop
    x := dbms_random.string('x',36);
    insert into pct0 values (x);
    insert into pct99 values (x);
    commit;
  end loop;
end;
/
There are two ways to get the row count inside a block: either by using the dbms_rowid package or by decoding the rowid.
Using dbms_rowid package on pct0 table
SQL> select DBMS_ROWID.ROWID_OBJECT(rowid) "OBJECT",
  2            DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) "FILE",
  3             DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) "BLOCK",count(*) from pct0
  4             group by DBMS_ROWID.ROWID_OBJECT(rowid), DBMS_ROWID.ROWID_RELATIVE_FNO(rowid), DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid);

    OBJECT       FILE      BLOCK   COUNT(*)
---------- ---------- ---------- ----------
    103409         14     206983        100
which shows that all the rows are in one block. Querying the pct99 table
SQL> select DBMS_ROWID.ROWID_OBJECT(rowid) "OBJECT",
  2            DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) "FILE",
  3             DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) "BLOCK",count(*) from pct99
  4             group by DBMS_ROWID.ROWID_OBJECT(rowid), DBMS_ROWID.ROWID_RELATIVE_FNO(rowid), DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid);

    OBJECT       FILE      BLOCK   COUNT(*)
---------- ---------- ---------- ----------
    103411         14     206998          1
    103411         14     207007          1
    103411         14     207014          1
    103411         14     207017          1
    103411         14     207041          1
    103411         14     207049          1
..
..
..
    OBJECT       FILE      BLOCK   COUNT(*)
---------- ---------- ---------- ----------
    103411         14     207098          1

100 rows selected.
which shows that each row is stored in a single block and 100 blocks are needed to store all the rows. Decoding the rowid gives the same result:
select rows_per_block,count(*) from (select substr(rowid,10,6) as block,count(*) as rows_per_block 
from pct0 group by substr(rowid,10,6)) group by rows_per_block order by 1;

ROWS_PER_BLOCK   COUNT(*)
-------------- ----------
           100          1


select rows_per_block,count(*) from (select substr(rowid,10,6) as block,count(*) as rows_per_block 
from pct99 group by substr(rowid,10,6)) group by rows_per_block order by 1;

ROWS_PER_BLOCK   COUNT(*)
-------------- ----------
             1        100
As more blocks are used, the space usage also increases compared to the pct0 table.
SQL> select segment_name as table_name, bytes/1024 as "Size KB" from user_segments where segment_name IN ('PCT0','PCT99');

TABLE_NAME    Size KB
---------- ----------
PCT0               64
PCT99             896
Multiple factors must be considered when setting the pctfree value, such as the trade-off between the increase in storage and the reduction in IO (for full table scans), and the nature of access to the segment (full table scans vs single block access).
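Note that pctfree can also be changed after creation, but the new value only applies to blocks formatted from then on; existing blocks keep their current layout until the segment is reorganised. A sketch:
ALTER TABLE pct99 PCTFREE 10;

SELECT table_name, pct_free FROM user_tables WHERE table_name = 'PCT99';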

Tuesday, August 7, 2012

Variable Length In-Clause : SQL vs PL/SQL Method Comparison

When the number of variables in an in-clause varies, each variation results in a new statement being created. The widely available solution (after some googling) is to create a PL/SQL function and use a table type to run the in-clause, so that only one statement is created and re-used irrespective of the number of parameters. This is a comparison between these two options: the PL/SQL method vs creating a unique SQL statement for each set of in-clause variables.
For the PL/SQL option create a table type and a function as below
create or replace type mytabletype as table of number;

create or replace function mytablefun(input in varchar2) return
mytabletype
as 
    l_str   LONG DEFAULT input || ',';
    l_n     NUMBER;
    l_data  mytabletype := mytabletype();
BEGIN
    LOOP
        l_n := instr( l_str, ',' );
        exit when (nvl(l_n,0) = 0);
        l_data.extend;
        l_data( l_data.count ) := ltrim(rtrim(substr(l_str,1,l_n-1)));
        l_str := substr( l_str, l_n+1 );
    END LOOP;
    RETURN l_data;
END;
/
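The function can be sanity checked from sql*plus before wiring it into the application; the following sketch should return three rows with the values 10, 20 and 30.
select column_value from table(mytablefun('10,20,30'));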
The test case is run using a java program which generates the sql statements. The code is given below
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class InclauseTest {

    public static void main(String[] args)  {
        try {

            //Connection pool creation code goes here e.g. OracleDataSource or UCP

            Connection con = pool.getConnection();

            long t1 = System.currentTimeMillis();

            newSqlMethod(con);       // runs the SQL method test
//            newFunctionMethod(con);  // uncomment to run the PL/SQL method test

            long t2 = System.currentTimeMillis();

            DBStats.displayStats(con);


            System.out.println("\n\n total time taken : "+(t2-t1));
            con.close();

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }


 public static void newSqlMethod(Connection con){

        String sql1 = "select * from x where a in (";
                        
        for (int i = 0 ; i < 10000; i++){
            
            try {
                int k = i % 10;
                PreparedStatement pr = con.prepareStatement(sql1+prepareStringCreate(k)+")");

                preparePopulate(pr, k+1);


                ResultSet rs = pr.executeQuery();

                while(rs.next()){

                    int q = rs.getInt(1);
                    double p = rs.getDouble(2);
                }

                rs.close();
                pr.close();


            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

  public static String prepareStringCreate( int i){
        
        StringBuilder x = new StringBuilder("?");

        for(int j = 0; j< i; j++){
            
            x.append(",?");
        }
        
        return x.toString();
        
    }

  public static void preparePopulate(PreparedStatement pr, int i) throws SQLException{

        for(int j = 0; j< i; j++){

            pr.setInt(j+1, (j+1)* 10);
        }

    }

 public static void newFunctionMethod(Connection con){

             String sql = "select * from x where a in (select * from table(SELECT CAST( mytablefun( ? ) AS mytabletype ) FROM dual))";

        for (int i = 0 ; i < 10000; i++){
            try {
                PreparedStatement pr = con.prepareStatement(sql);

               StringBuilder x = new StringBuilder("10");

               int q = i%10;

               for (int j = 0; j<q  ; j++){

                   x.append(","+((j+2)*10));
               }

                
                pr.setString(1, x.toString());
                ResultSet rs = pr.executeQuery();

                while(rs.next()){

                    int k = rs.getInt(1);
                    double p = rs.getDouble(2);
                }

                rs.close();
                pr.close();

            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }

    }
}

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class DBStats {

    public static void displayStats(Connection con) {
        try {
            PreparedStatement pr = con.prepareStatement("select name,value " + "from v$mystat,v$statname " + "where v$mystat.statistic#=v$statname.statistic# " 
//                    + "and v$statname.statistic# in (11,12)"); //11gR1
                    + "and v$statname.statistic# in (16,17)"); //11gR2

            ResultSet rs = pr.executeQuery();


            while(rs.next()){

                System.out.println(rs.getString(1)+" "+rs.getDouble(2));
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }

    }
}
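The statistic# values hardcoded in DBStats differ between versions (11,12 on 11gR1 vs 16,17 on 11gR2); filtering by statistic name instead avoids the version dependence. A sketch of the equivalent query:
SELECT n.name, s.value
FROM v$mystat s, v$statname n
WHERE s.statistic# = n.statistic#
AND n.name IN ('CPU used when call started','CPU used by this session');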
Running the test case once results in 10000 sql statements being executed, with the number of values used in the in-clause varying from 1 to 10. The DBStats class is used to get the CPU used by the session. The java test program is run twice (resulting in 20000 executions) and SQL statistics are taken. The shared pool is flushed before the first run but not between runs (so the statistics are cumulative).


First the results from the SQL method. The session stats show
first run
CPU used when call started 47.0
CPU used by this session 49.0

second run
CPU used when call started 43.0
CPU used by this session 43.0
SQL stats
SQL> select sql_id,cpu_time,elapsed_time,executions,SHARABLE_MEM,PERSISTENT_MEM,RUNTIME_MEM,
sql_text from v$sql where sql_text like 'select * from x where a in%' order by sql_text;

SQL_ID          CPU_TIME ELAPSED_TIME EXECUTIONS SHARABLE_MEM PERSISTENT_MEM RUNTIME_MEM SQL_TEXT
------------- ---------- ------------ ---------- ------------ -------------- ----------- ----------------------------------------------------------------------
26fb47dhnuz88      28994        94332       2000        19753           4496        3408 select * from x where a in (:1 )
068anap1mvbdx      46993       105790       2000        15701           4648        3528 select * from x where a in (:1 ,:2 )
387t0d508fkzb      55993       107300       2000        19777           4728        3576 select * from x where a in (:1 ,:2 ,:3 )
3hm2vcm9q6zup      35994       108100       2000        19797           4808        3624 select * from x where a in (:1 ,:2 ,:3 ,:4 )
7fg42cswna64p      55992       107610       2000        19825           4888        3672 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 )
89k5ysfx65zpu      52992       109258       2000        19845           4968        3720 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 )
32w3mnqx0vhf4      40994       109297       2000        19865           5048        3768 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 )
4yxvar786t71u      63987       110639       2000        19885           5128        3816 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 )
7x0jrn3nj8ux6      54987       112536       2000        19913           5208        3864 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 )
0z3udxfqg2p0p      57998       112400       2000        19942           5288        3912 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 ,:10 )

10 rows selected.
As expected, the SQL method results in 10 statements being created, each getting an equal number of executions. Looking at the per-execution stats, although the elapsed time per execution is roughly the same, the CPU time values vary.
SQL> SELECT sql_id,
  2    cpu_time,
  3    round(cpu_time/executions,2) as cpu_per_exec,
  4    elapsed_time,
  5    round(elapsed_time/executions,2) as ela_per_exec,
  6    executions,
  7  --  SHARABLE_MEM,
  8  --    PERSISTENT_MEM,
  9  --  RUNTIME_MEM,
 10    sql_text
 11  FROM v$sql
 12  WHERE sql_text LIKE 'select * from x where a in%'
 13  ORDER BY sql_text;

SQL_ID        CPU_TIME CPU_PER_EXEC ELAPSED_TIME ELA_PER_EXEC EXECUTIONS SQL_TEXT
------------- -------- ------------ ------------ ------------ ---------- ---------------------------------------------------------------------
26fb47dhnuz88    28994         14.5        94332        47.17       2000 select * from x where a in (:1 )
068anap1mvbdx    46993         23.5       105790         52.9       2000 select * from x where a in (:1 ,:2 )
387t0d508fkzb    55993           28       107300        53.65       2000 select * from x where a in (:1 ,:2 ,:3 )
3hm2vcm9q6zup    35994           18       108100        54.05       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 )
7fg42cswna64p    55992           28       107610        53.81       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 )
89k5ysfx65zpu    52992         26.5       109258        54.63       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 )
32w3mnqx0vhf4    40994         20.5       109297        54.65       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 )
4yxvar786t71u    63987        31.99       110639        55.32       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 )
7x0jrn3nj8ux6    54987        27.49       112536        56.27       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 )
0z3udxfqg2p0p    57998           29       112400         56.2       2000 select * from x where a in (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 ,:10 )
Next is the PL/SQL method; again the java program is run twice and the shared pool is flushed before the first run but not between runs. The session stats show
first run
CPU used when call started 479.0
CPU used by this session 481.0

second run
CPU used when call started 267.0
CPU used by this session 267.0
CPU used by the session is high compared to the SQL statement method earlier. The sql stats
SQL_ID          CPU_TIME ELAPSED_TIME EXECUTIONS SHARABLE_MEM PERSISTENT_MEM RUNTIME_MEM SQL_TEXT
------------- ---------- ------------ ---------- ------------ -------------- ----------- ----------------------------------------------------------------------
1p40d61qp4u6g    6333007      7284060      19997        28748           7040        5328 select * from x where a in (select * from table(SELECT CAST( mytablefu
                                                                                         n( :1  ) AS mytabletype ) FROM dual))

1p40d61qp4u6g      91986        91943          3        28748           7040        5328 select * from x where a in (select * from table(SELECT CAST( mytablefu
                                                                                         n( :1  ) AS mytabletype ) FROM dual))
Per exec values
SQL_ID            CPU_TIME CPU_PER_EXEC ELAPSED_TIME ELA_PER_EXEC EXECUTIONS SQL_TEXT
------------- ------------ ------------ ------------ ------------ ---------- ---------------------------------------------------------------------
1p40d61qp4u6g      6333007        316.7      7284060       364.26      19997 select * from x where a in (select * from table(SELECT CAST( mytablef
                                                                             un( :1  ) AS mytabletype ) FROM dual))

1p40d61qp4u6g        91986        30662        91943     30647.67          3 select * from x where a in (select * from table(SELECT CAST( mytablef
                                                                             un( :1  ) AS mytabletype ) FROM dual))
First, some clarification as to why there are two sets of statements when only one is expected. Looking at v$sql shows two child cursors for the statement
SQL> SELECT sql_id,
  2  child_number,
  3  address,
  4  executions,sql_text
  5  FROM v$sql
  6  WHERE sql_text LIKE 'select * from x where a in%'
  7  ORDER BY sql_text;

SQL_ID        CHILD_NUMBER ADDRESS          EXECUTIONS SQL_TEXT
------------- ------------ ---------------- ---------- ----------------------------------------------------------------------
1p40d61qp4u6g            1 00000002088B14D0      19997 select * from x where a in (select * from table(SELECT CAST( mytablefu
                                                       n( :1  ) AS mytabletype ) FROM dual))

1p40d61qp4u6g            0 00000002088B14D0          3 select * from x where a in (select * from table(SELECT CAST( mytablefu
                                                       n( :1  ) AS mytabletype ) FROM dual))
                
                
SQL> select sql_id,address,child_number,use_feedback_stats from V$SQL_SHARED_CURSOR where sql_id='1p40d61qp4u6g'  and address='00000002088B14D0';

SQL_ID        ADDRESS          CHILD_NUMBER USE_FEEDBACK_STATS
------------- ---------------- ------------ --------------------
1p40d61qp4u6g 00000002088B14D0            0 Y
1p40d61qp4u6g 00000002088B14D0            1 N
"Y" on use_feedback_stats means "A hard parse is forced so that the optimizer can reoptimize the query with improved cardinality estimates" (From Oracle docs).

The comparison is made on cpu time and shareable memory. First the cpu time comparison
Sum of cpu time from the SQL method test = 494924
Sum of cpu time from the PL/SQL method test = 6424993
It could be seen that the PL/SQL method uses over 1198% more CPU time than the SQL method.

Comparing the shareable memory used by two methods
Sum of shareable memory from SQL method test = 194303
Sum of shareable memory from PL/SQL method test = 57496
It could be seen that, due to the creation of multiple statements, the SQL method uses 238% more shareable memory than the PL/SQL method.

The comparison shows the two methods have a trade-off between CPU and memory.