Friday, March 28, 2008

Blockrecovery using RMAN

Use dbv (db_verify) to find out if there are corrupted blocks.

find out file# and block# with


select * from v$database_block_corruption;

This view displays blocks marked corrupt by the most recent back with or without the VALIDATE option.

BACKUP VALIDATE DATABASE;

use rman to recover the block/s with


blockrecover datafile file# block block#

or

blockrecover corruption list;


Above is valid for Oracle 10g.

Oracle 11g is replacing the blockrecover command with recover command

recover datafile file# block block#

this searches for uncorrupted blocks in flashback logs before searching for the blocks in full or level 0 incremental backups.Block media recovery can only use redo logs for media recovery, not level 1 incremental backups.



Thursday, March 27, 2008

Archive Gaps due to Firewall Connection Timeouts

Shipment of log files between primary and standby data-guard nodes, intermittently fails withORA-12570 TNS:packet reader failure, when the connection is going via a firewall.

Firewall is removing the network connection from under the Oracle Net layer. If the network connection has no connectivity and a timeout filter is enabled, then the connection can be lost.

Set the SQLNET.EXPIRE_TIME=X parameter in the SQLNET.ORA file on both servers.This enables Dead Connection Detection (DCD). When enabled, this sends a probe packet from the server (Standby node) to the client(Primary node) every X about of minutes, for the value of X set in the SQLNET.ORA file.

Set on both servers, so any change such as switchover between nodes, ensures the parameter is set for both directions.


More on metalink note 550103.1 , 151972.1 and 257650.1

WARNING: inbound connection timed out (ORA-3136)

The "WARNING: inbound connection timed out (ORA-3136)" in the alert log indicates that the client was not able to complete it's authentication within the period of time specified by parameter SQLNET.INBOUND_CONNECT_TIMEOUT.

You may also witness ORA-12170 without timeout error on the database server sqlnet.log file.
This entry would also have the clinet address which failed to get authenticated. Some applications or JDBC thin driver applications may not have these details.

From 10.2 onwards the default value of this parameter is 60 seconds, hence if the client is not able authenticate within 60 secs , the warning would appear in the alert log and the client connection will be terminated.

This timeout restriction was introduced to combat Denial of Service (DoS) attack whereby malicious clients attempt to flood database servers with connect requests that consumes resources.

There can be three main reasons for this error

  1. Server gets a connection request from a malicious client which is not supposed to connect to the database , in which case the error thrown is the correct behavior. You can get the client address for which the error was thrown via sqlnet log file.
  2. The server receives a valid client connection request but the client takes a long time to authenticate more than the default 60 seconds.
  3. The DB server is heavily loaded due to which it cannot finish the client logon within the timeout specified.

The default value of 60 seconds is good enough in most conditions for the database server to authenticate a client connection. If its taking longer period, then its worth checking all the below points before going for the workadound:

1. Check whether local connection on the database server is sucessful & quick.

2. If local connections are quick ,then check for underlying network delay with the help of your network administrator.

3. Check whether your Database performance has degraded by anyway.

4. Check alert log for any critical errors for eg, ORA-600 or ORA-7445 and get them resolved first.

These critical errors might have triggered the slowness of the database server.

As a workaround to avoid only this warning messages, you can set the parameters SQLNET.INBOUND_CONNECT_TIMEOUT and INBOUND_CONNECT_TIMEOUT_listenername
to the value more than 60.


In server side sqlnet.ora file add SQLNET.INBOUND_CONNECT_TIMEOUT

SQLNET.INBOUND_CONNECT_TIMEOUT = 120
In listener.ora file INBOUND_CONNECT_TIMEOUT_listenername

INBOUND_CONNECT_TIMEOUT_LISTENER = 110

From Oracle version 10.2.0.3 onwards the default value of INBOUND_CONNECT_TIMEOUT_ is 60 seconds. For previous releases it is zero by default.

More on meta link note 465043.1 and 345197.1




Shared Server Status


The STATUS column of the V$SHARED_SERVER column gives useful informationabout WAIT status.

WAIT(ENQ) status tells the DBA that the user is waiting for a lockresource, and in rare cases, acts as an alert for a deadlock situation.

WAIT(COMMON) indicated the shared servers running on this instance arewaiting for requests to process. Hence, this servers is not busy.

WAIT(RECEIVE) indicates the shared server becomes dedicated


Wednesday, March 26, 2008

Data Guard Standby with OMF or ASM

OMF

1. Enable forced logging on the primary database.

2. Enable archiving on the primary database.

3. Set all necessary initialization parameters on the primary database.

4. Create an initialization parameter file for the standby database.

5. If the primary database is configured to use OMF, then Oracle recommends that the standby database be configured to use OMF, too. To do this, set the DB_CREATE_FILE_DEST and DB_CREATE_ONLINE_LOG_DEST_n initialization parameters to appropriate values. Maintenance and future role transitions are simplified if the same disk group names are used for both the primary and standby databases.

6. Set the STANDBY_FILE_MANAGEMENT initialization parameter to AUTO.

7. Configure Oracle Net, as required, to allow connections to the standby database.

8. Create a remote login password file for the standby database. Use the same
password for the SYS account as on the primary database.

9. Start the standby database instance without mounting the control file.

ASM


1. If the standby database is going to use ASM, create an ASM instance if one does
not already exist on the standby database system.

2. Use the RMAN BACKUP command to create a backup set that contains a copy of
the primary database’s data files, archived log files, and a standby control file.

3. Use the RMAN DUPLICATE … FOR STANDBY command to copy the data files,
archived redo log files and standby control file in the backup set to the standby
database’s storage area.

The DUPLICATE … FOR STANDBY command performs the actual data movement
at the standby instance. If the backup set is on tape, the media manager must be
configured so that the standby instance can read the backup set. If the backup set
is on disk, the backup pieces must be readable by the standby instance, either by
making their primary path names available through NFS, or by copying them to
the standby system and using RMAN CATALOG BACKUPPIECE command to
catalog the backup pieces before restoring them.


Restrict / as sysdba

add

SQLNET.AUTHENTICATION_SERVICES=(NONE)

to sqlnet.ora

Tuesday, March 18, 2008

Shared Memory & Semaphores Used by Oracle Instance

$ORACLE_HOME/bin/sysresv
IPC Resources for ORACLE_SID "tbxdb1" :
Shared Memory:
ID KEY
32769 0x90bff314
Semaphores:
ID KEY
229377 0x7224b954
Oracle Instance alive for sid "tbxdb1"

from OS


ipcs -m

------ Shared Memory Segments --------
key shmid owner perms bytes nattch status

0x90bff314 32769 oracle 600 1075838976 34


ipcs -s

------ Semaphore Arrays --------
key semid owner perms nsems
0x7224b954 229377 oracle 660 1004


Find out Interconnect Protocol

SQL > oradebug setmypid
Statement processed.

SQL>oradebug ipc
Information written to trace file.


Trace file can be found in user_dump_dest directory
SSKGXPT 0x66964c0 flags SSKGXPT_READPENDING socket no 7 IP 10.1.10.22 UDP 52459


This method can be used for both RDMS and ASM instances.

Alert logs of these various types of instances also can be used to find out the interconnect protocol.
cluster interconnect IPC version:Oracle UDP/IP (generic)


Database Links

As sys or system

create public database link link_name using 'tns_entry_name';

As the user using the link

create database link public_link_name connect to remote_db_username identified by remote_db_pw;

Wednesday, March 12, 2008

Data File deleted using OS utility

If backups are available then

mount the database and
ALTER DATABASE CREATE DATAFILE old datafile name as new datafile name;
or
ALTER DATABASE CREATE DATAFILE datafile# as new;

recover datafile datfile name (or datafile#);
open the database

If backups are not available then remove the datafile entry so that database can be opened.

mount the database and

alter database datafile datafile# offline drop;
or
alter database datafile datafile name offline drop;

open the database and drop the tablespace if no longer needed.

Wednesday, March 5, 2008

Dispatcher Information on Listener




OS Watcher

Download information is available on metlink note 301137.1

Data collectors exist for top, vmstat, iostat, mpstat, netstat, ps and an optional collector for tracing private networks. To turn on data collection for private networks the user must create an executable file in the osw directory named private.net. An example of what this file should look like is named Example private.net in the osw directory. This file can be edited and renamed private.net or a new file named private.net can be created. This file contains entries for running the traceroute command to verify RAC private networks.

Start OSW

startOSW.sh interval_in_sec number_of_hours_to_archive_the_data
if not arguments are present then default will be every 10 seconds and 48 hours of data will be archived.
It's a moving window of 48 hours, meaning on the 49th hour the 1st hour's data will be deleted leaving 48 hours worth of data on the system.

Use nohup ./startOSW.sh 60 10 & to start OSW in background.

Stop OSW

to stop the OSW use stopOSW.sh

OSW Output

OSW output is available in OSW_HOME/archive directory. It will have a seperate subdirecotry for each Os utility. Output file will have the following format for its name node_name_OS_utility_YY.MM.DD.HH24.dat

oswiostat

Field Description
r/s :- Shows the number of reads/second
w/s :- Shows the number of writes/second
kr/s :- Shows the number of kilobytes read/second
kw/s :- Shows the number of kilobytes written/second
wait :- Average number of transactions waiting for service (queue length)
actv :- Average number of transactions actively being serviced
wsvc_t :- Average service time in wait queue, in milliseconds
asvc_t :- Average service time of active transactions, in milliseconds
%w :- Percent of time there are transactions waiting for service
%b :- Percent of time the disk is busy device Device name

what to look for

  1. Average service times greater than 20msec for long duration.
  2. High average wait times.


oswmpstat

Field Description
cpu Processor ID
minf Minor faults
mif Major Faults
xcal Processor cross-calls (when one CPU wakes up another by interrupting it).
intr Interrupts
ithr Interrupts as threads (except clock)
csw Context switches
icsw Involuntary context switches
migr Thread migrations to another processor
smtx Number of times a CPU failed to obtain a mutex
srw Number of times a CPU failed to obtain a read/write lock on the first try
syscl Number of system calls
usr Percentage of CPU cycles spent on user processes
sys Percentage of CPU cycles spent on system processes
wt Percentage of CPU cycles spent waiting on event
idl Percentage of unused CPU cycles or idle time when the CPU is basically doing nothing

what to look for

  1. Involuntary context switches (this is probably the more relevant statistic when examining performance issues.)
  2. Number of times a CPU failed to obtain a mutex. Values consistently greater than 200 per CPU causes system time to increase.
  3. xcal is very important, show processor migration

OSW graph comes bundled with OSW v2.0.0 and higher. Document is available on metalink note 461053.1

It requires java version 1.4 or higher. to use simply run

java -jar -Xmx512M OSWg.jar -i /home/osw/archive


where /home/osw/archive is the path to the direcotry containing the output subdirectories.


For RAC environment to monitor private network copy the provided Exampleprivate.net as private.net into the same directory. Remove all commands in the private.net except for the ones that are matching the current OS. Replace the node names in the private.net with actual node names of the system.

Monday, March 3, 2008

Auto Start Oracle

Start and shut of oracle database when machine boot up and shutdown.

create a script name oracle with the below text in /etc/init.d as root.


#chkconfig: 2345 99 1
#description: testing script
#
#!/bin/sh

user=oracle
SU=/bin/su
export ORACLE_HOME=/opt/app/oracle/product/10.2.0/db_1

case "$1" in
start)
touch /var/lock/subsys/oracle
$SU $user $ORACLE_HOME/bin/dbstart
;;
restart)
$SU $user $ORACLE_HOME/bin/dbshut
rm -f /var/lock/subsys/oracle
touch /var/lock/subsys/oracle
$SU $user $ORACLE_HOME/bin/dbstart
;;
stop)
$SU $user $ORACLE_HOME/bin/dbshut
rm -f /var/lock/subsys/oracle
;;
*)
echo "Usage : oracle start restart stop"
;;
esac

#chkconfig: 2345 99 1 tells the start portion will run in runlevels of 2,3,4 and 5 and start value is 1 (S01) and kill value is 99 (K99).

A file in /var/lock/subsys/service_name is needed to run the shutdow part of the script.
/var/lock/subsys/oracle is used for this.

Staying in /etc/init.d type

/sbin/chkconfig --add oracle to add the startup and shutdown scripts to respective runlevels.
/sbin/chkconfig --del oracle to delete the service.
/sbin/chkconfig oracle on/off to disable and enable service start up on boot time.

edit the /etc/oratab file and add Y to end of the line of the instance that needs to startup when machine boots.

ent10:/oraclebase/app/oracle/product/10.2.0/db_1:Y

Current dbstart and dbshut script in $ORACLE_HOME/bin has errors that require you pass the ORACLE_HOME when you call them. To fix this edit
ORACLE_HOME_LISTNER=$1 to ORACLE_HOME_LISTNER=$ORACLE_HOME
on both dbstart and dbshut.

Related Metalink notes

Note:61333.1 - AUTOMATIC STARTUP AND SHUTDOWN OF ORACLE DATABASE INSTANCES
Note:207508.1 - Dbstart does not work if using an spfile only
Note:91815.1 - The Annotated dbstart Script
Note:1019790.6 - AUTOMATIC STARTUP/SHUTDOWN OF ORACLE INSTANCES ON HP-UX UNIX
Note:1019817.6 - AUTOMATIC STARTUP/SHUTDOWN OF ORACLE INSTANCES ON IBM AIX
Note:222813.1 - How to Automate Startup/Shutdown of Oracle Database on Linux