Tuesday, December 24, 2013

SCAN (Single Client Access Name) Set Up Using DNS

Setting up a SCAN allows clients connecting to databases (or instances) in a RAC to use a single name to connect to any database (or instance). It can be considered an alias for all the databases in the cluster. Unlike TNS entries based on VIPs, which need to be modified each time an instance is added or removed, TNS entries based on the SCAN do not need to be modified (an example entry is shown just before step 1). SCAN was first introduced with 11gR2 and had additional features added with 12c. For more, read the SCAN white paper.
As a prerequisite for RAC, the SCAN can be set up either using DNS or GNS (Grid Naming Service). For non-production, test environments a single IP with the same hostname could be added to the /etc/hosts files of all the RAC nodes to get the installation going, but this is not recommended (see 887522.1 Instead of DNS or GNS, Can we use '/etc/hosts' to resolve SCAN?) and the SCAN prerequisite check will fail (887471.1), although the installation will complete.
Setting up the SCAN falls under the network administrator's job role; the DBA's part is to request "at least one single name that resolves to three IP addresses using a round robin algorithm". This post lists the steps to set up a SCAN using a DNS configuration. It will set up a single client access name that resolves to three IPs in a round robin fashion. Nevertheless, it is recommended that a network administrator is consulted when setting up the SCAN for a production system.
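As an illustration of the point above, a client side tnsnames.ora entry using the SCAN could look like the one below. The SCAN name matches the one configured later in this post, while the service name MYSERVICE and port 1521 are assumptions made for the example, not values from any particular system.
MYSERVICE =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan.mydomain.net)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = MYSERVICE)
    )
  )
This entry stays the same no matter how many instances are added to or removed from the cluster.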
1. Verify the DNS related rpms are installed. The following three rpms are required.
# rpm -qa | grep bind
bind-libs-9.8.2-0.17.rc1.el6.x86_64
bind-9.8.2-0.17.rc1.el6.x86_64
bind-utils-9.8.2-0.17.rc1.el6.x86_64
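If any of these are missing they can usually be installed with yum, assuming a RHEL/OEL 6 repository is available.
# yum install -y bind bind-libs bind-utils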
2. Note down the hostname and the IP of the server where the SCAN is being set up. Also find out the IP of the DNS server that is already configured.
# hostname
hpc6.mydomain.net

# ifconfig
em1       Link encap:Ethernet  HWaddr 00:26:B9:FE:7D:E0
          inet addr:192.168.0.104  Bcast:192.168.0.255  Mask:255.255.255.0

cat /etc/resolv.conf
search mydomain.net
nameserver 11.6.9.2
These values will be referred to in subsequent steps.
3. Setting up the DNS involves adding the IP and port the DNS service listens on to the /etc/named.conf file. The default /etc/named.conf file looks as follows
options {
        listen-on port 53 { 127.0.0.1; };
        listen-on-v6 port 53 { ::1; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        allow-query     { localhost; };
        recursion yes;

        dnssec-enable yes;
        dnssec-validation yes;
        dnssec-lookaside auto;

        /* Path to ISC DLV key */
        bindkeys-file "/etc/named.iscdlv.key";

        managed-keys-directory "/var/named/dynamic";
};

logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};

zone "." IN {
        type hint;
        file "named.ca";
};

include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";
Add new entries to configure the DNS and the SCAN
options {
        listen-on port 53 { 192.168.0.104; 127.0.0.1; }; #IP of the DNS server (noted on step 2)
#       listen-on-v6 port 53 { ::1; }; # IPv6 disabled
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        allow-query     { localhost; 192.168.0.0/24; };
        recursion yes;
        allow-transfer {"none";};

        forwarders { 11.6.9.2; };  # main DNS IP, anything that cannot be resolved will be forwarded to this IP
};

logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};

zone "mydomain.net" IN {
        type master;
        file "mydomain.net.zone"; # forward lookups entry file
        allow-update { none; };
};


zone "0.168.192.in-addr.arpa" IN {
        type master;
        file "rev.mydomain.net.zone"; # reverse lookup entry file
        allow-update { none; };
};
4. Create forward lookup and reverse lookup files in /var/named. The forward lookup file was named "mydomain.net.zone" for the zone "mydomain.net" in step 3.
cat /var/named/mydomain.net.zone

$TTL 86400
@          IN     SOA    hpc6.mydomain.net.  root.hpc6.mydomain.net. (
                         42 ; serial (d. adams)
                         3H ; refresh
                        15M ; retry
                         1W ; expiry
                         1D ) ; minimum
          IN   NS     hpc6.mydomain.net.
hpc6       IN   A      192.168.0.104
rac-scan        IN      A       192.168.0.86
rac-scan        IN      A       192.168.0.93
rac-scan        IN      A       192.168.0.94
Note the places where the hostname of the server on which the SCAN is set up is being used. Create the reverse lookup file as follows
cat rev.mydomain.net.zone
$ORIGIN 0.168.192.in-addr.arpa.
$TTL 1H
@       IN      SOA     hpc6.mydomain.net.     root.hpc6.mydomain.net. (      2
                                                3H
                                                1H
                                                1W
                                                1H )
0.168.192.in-addr.arpa.         IN NS      hpc6.mydomain.net.

104     IN PTR  hpc6.mydomain.net.
86     IN PTR  rac-scan.mydomain.net.
94     IN PTR  rac-scan.mydomain.net.
93     IN PTR  rac-scan.mydomain.net.
5. The validity of these configuration files can be checked with named-checkconf and named-checkzone.
# named-checkconf /etc/named.conf

# named-checkzone mydomain.net /var/named/mydomain.net.zone
zone mydomain.net/IN: loaded serial 42
OK

# named-checkzone 0.168.192.in-addr.arpa  /var/named/rev.mydomain.net.zone
zone 0.168.192.in-addr.arpa/IN: loaded serial 2
OK
However, the absence of errors in this step is no guarantee that the SCAN will work as expected; that can only be verified with a dig command (shown in a later step).
6. Edit the /etc/resolv.conf file and add the new DNS server IP (the server's own IP).
cat /etc/resolv.conf

search mydomain.net
nameserver 192.168.0.104
Also add an entry to /etc/hosts as below
192.168.0.104   hpc6.mydomain.net        hpc6
7. Start the DNS service
# /etc/init.d/named start
If the first start of the service gets stuck at the following step
# /etc/init.d/named start
Generating /etc/rndc.key:
run the following command and start the service again
# rndc-confgen -a -r /dev/urandom
wrote key file "/etc/rndc.key"

# /etc/init.d/named start
Once started, verify that the DNS service is listening on the port (53 in this case) defined in the named.conf file
netstat -ntl
tcp        0      0 192.168.0.104:53            0.0.0.0:*                   LISTEN
tcp        0      0 127.0.0.1:53                0.0.0.0:*                   LISTEN
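Optionally (not part of the original steps) the named service can also be set to start at boot, so that the SCAN remains resolvable after a reboot of this server.
# chkconfig named on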


8. Using the dig command, verify that forward and reverse lookups are functioning properly.
# dig hpc6.mydomain.net

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.17.rc1.el6 <<>> hpc6.mydomain.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52260
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;hpc6.mydomain.net.              IN      A

;; ANSWER SECTION:
hpc6.mydomain.net.       86400   IN      A       192.168.0.104

;; AUTHORITY SECTION:
mydomain.net.            86400   IN      NS      hpc6.mydomain.net.

;; Query time: 0 msec
;; SERVER: 192.168.0.104#53(192.168.0.104)
;; WHEN: Wed Aug 28 16:12:44 2013
;; MSG SIZE  rcvd: 64
Check that the query status is NOERROR and the answer count is 1. This confirms that the query was answered without any errors. Check the reverse lookup as well
# dig -x 192.168.0.104

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.17.rc1.el6 <<>> -x 192.168.0.104
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30913
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;; QUESTION SECTION:
;104.0.168.192.in-addr.arpa.    IN      PTR

;; ANSWER SECTION:
104.0.168.192.in-addr.arpa. 3600 IN     PTR     hpc6.mydomain.net.

;; AUTHORITY SECTION:
0.168.192.in-addr.arpa. 3600    IN      NS      hpc6.mydomain.net.

;; ADDITIONAL SECTION:
hpc6.mydomain.net.       86400   IN      A       192.168.0.104

;; Query time: 0 msec
;; SERVER: 192.168.0.104#53(192.168.0.104)
;; WHEN: Wed Aug 28 16:13:09 2013
;; MSG SIZE  rcvd: 104
9. If there are errors, recheck the zone files and correct any mistakes. Once the dig command returns no errors, run nslookup to see the name resolve to the three IPs in a round robin fashion.
# nslookup rac-scan # lookup 1
Server:         192.168.0.104
Address:        192.168.0.104#53

Name:   rac-scan.mydomain.net
Address: 192.168.0.94
Name:   rac-scan.mydomain.net
Address: 192.168.0.86
Name:   rac-scan.mydomain.net
Address: 192.168.0.93

# nslookup rac-scan # lookup 2
Server:         192.168.0.104
Address:        192.168.0.104#53

Name:   rac-scan.mydomain.net
Address: 192.168.0.86
Name:   rac-scan.mydomain.net
Address: 192.168.0.93
Name:   rac-scan.mydomain.net
Address: 192.168.0.94

# nslookup rac-scan # lookup 3
Server:         192.168.0.104
Address:        192.168.0.104#53

Name:   rac-scan.mydomain.net
Address: 192.168.0.93
Name:   rac-scan.mydomain.net
Address: 192.168.0.94
Name:   rac-scan.mydomain.net
Address: 192.168.0.86
10. To use the SCAN in a RAC configuration, edit the /etc/resolv.conf file on each of the RAC nodes and add the IP of the server where the SCAN is set up.
cat /etc/resolv.conf

search mydomain.net
nameserver 192.168.0.104
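Once the RAC nodes point to the new DNS server, a quick sanity check (run on each node) is to confirm that the SCAN name resolves to all three IPs.
# nslookup rac-scan.mydomain.net
# dig +short rac-scan.mydomain.net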
Related Post
GNS Setup for RAC

Wednesday, December 4, 2013

Excessive Audit File (*.aud) Generation

An excessive amount of audit file generation was observed in an 11gR2 RAC environment. The audit trail parameter was set to OS and the adump directory was getting populated with *.aud files at a rate of 60 per second. Audit files were cleaned out using cron (1298957.1), but the schedule frequency wasn't enough to reduce the number of *.aud files to an acceptable level. With 11gR2 (also available in 10.2.0.5 and 11.1.0.7) it is possible to use DBMS_AUDIT_MGMT to purge audit records, including OS audit files (731908.1), as sketched below.
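A minimal sketch of that approach for the OS audit trail is given below; the 30 day retention and the instance number 1 are assumptions made for the example, not values taken from this system (see 731908.1 for the full procedure).
BEGIN
  -- one-time initialisation of the cleanup infrastructure, if not already done
  IF NOT DBMS_AUDIT_MGMT.IS_CLEANUP_INITIALIZED(DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS) THEN
    DBMS_AUDIT_MGMT.INIT_CLEANUP(
      audit_trail_type         => DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS,
      default_cleanup_interval => 24);
  END IF;

  -- mark OS audit files older than 30 days on instance 1 as eligible for deletion
  DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP(
    audit_trail_type    => DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS,
    last_archive_time   => SYSTIMESTAMP - INTERVAL '30' DAY,
    rac_instance_number => 1);

  -- delete the OS audit files older than the timestamp set above
  DBMS_AUDIT_MGMT.CLEAN_AUDIT_TRAIL(
    audit_trail_type        => DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS,
    use_last_arch_timestamp => TRUE);
END;
/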
Starting with 11gR1, audit file generation has changed to create a new audit file for each new session instead of appending to an existing audit file (1474823.1). This generates a lot of small audit files, and bursts of audit file generation can be seen for 3-tier applications when connection pools are initialized. Also, by default, auditing is enabled for several actions. These can be found with the scripts in 1019552.6 or 287436.1. Below is the output from running the script (1019552.6) on a new 11.2.0.4 RAC DB (only the audit trail was changed to OS).
SQL> @tstaudit
Press return to see the audit related parameters...

NAME                 DISPLAY_VALUE
-------------------- --------------------
audit_file_dest      /opt/app/oracle/admin/std11g2/adump

audit_sys_operations FALSE
audit_syslog_level
audit_trail          OS

System auditing options across the system and by user

User name    Proxy name   Audit Option                   SUCCESS    FAILURE
------------ ------------ ------------------------------ ---------- ----------
                          ALTER ANY PROCEDURE            BY ACCESS  BY ACCESS
                          ALTER ANY TABLE                BY ACCESS  BY ACCESS
                          ALTER DATABASE                 BY ACCESS  BY ACCESS
                          ALTER PROFILE                  BY ACCESS  BY ACCESS
                          ALTER SYSTEM                   BY ACCESS  BY ACCESS
                          ALTER USER                     BY ACCESS  BY ACCESS
                          CREATE ANY JOB                 BY ACCESS  BY ACCESS
                          CREATE ANY LIBRARY             BY ACCESS  BY ACCESS
                          CREATE ANY PROCEDURE           BY ACCESS  BY ACCESS
                          CREATE ANY TABLE               BY ACCESS  BY ACCESS
                          CREATE EXTERNAL JOB            BY ACCESS  BY ACCESS
                          CREATE PUBLIC DATABASE LINK    BY ACCESS  BY ACCESS
                          CREATE SESSION                 BY ACCESS  BY ACCESS
                          CREATE USER                    BY ACCESS  BY ACCESS
                          DATABASE LINK                  BY ACCESS  BY ACCESS
                          DIRECTORY                      BY ACCESS  BY ACCESS
                          DROP ANY PROCEDURE             BY ACCESS  BY ACCESS
                          DROP ANY TABLE                 BY ACCESS  BY ACCESS
                          DROP PROFILE                   BY ACCESS  BY ACCESS
                          DROP USER                      BY ACCESS  BY ACCESS
                          EXEMPT ACCESS POLICY           BY ACCESS  BY ACCESS
                          GRANT ANY OBJECT PRIVILEGE     BY ACCESS  BY ACCESS
                          GRANT ANY PRIVILEGE            BY ACCESS  BY ACCESS
                          GRANT ANY ROLE                 BY ACCESS  BY ACCESS
                          PROFILE                        BY ACCESS  BY ACCESS
                          PUBLIC SYNONYM                 BY ACCESS  BY ACCESS
                          ROLE                           BY ACCESS  BY ACCESS
                          SYSTEM AUDIT                   BY ACCESS  BY ACCESS
                          SYSTEM GRANT                   BY ACCESS  BY ACCESS

29 rows selected.

Press return to see auditing options on all objects...

no rows selected
Press return to see audit trail... Note that the query returns the audit data for the last day only

no rows selected
Press return to see system privileges audited across the system and by user...
As seen from the above output, one of the audit options is "CREATE SESSION". As changing the audit trail requires a restart of the database (RAC allows rolling restarts), it was decided to remove the audit on create session instead (a NOAUDIT sketch is shown after the audit file excerpt below). This reduced the number of audit files generated, but bursts of audit files being generated every 5 seconds could still be seen. Having examined the audit files, it was clear that no audit files were generated for non-sys users after removing the audit on create session. The only audit files now being generated were for sys users, and they only had the following content in them
Thu Nov 28 14:32:12 2013 +01:00
LENGTH : '155'
ACTION :[7] 'CONNECT'
DATABASE USER:[1] '/'
PRIVILEGE :[6] 'SYSDBA'
CLIENT USER:[6] 'oracle'
CLIENT TERMINAL:[0] ''
STATUS:[1] '0'
DBID:[10] '2765679112'
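For reference, the change described above (removing the default audit on session creation) amounts to a single statement run as a privileged user; this is shown only as a sketch, and audit requirements should be reviewed before removing default audit options.
SQL> noaudit create session;
The remaining statement audit options can be checked afterwards with:
SQL> select user_name, audit_option, success, failure from dba_stmt_audit_opts where audit_option = 'CREATE SESSION';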



Audit records for various sys user operations are created by default, irrespective of the configuration setting (308066.1). Audit files could also be generated by the grid control agent (1196323.1), which wasn't applicable in this case as there was no grid control or agent running. Another possible cause is the resource health check by the grid infrastructure (1378881.1). The resource check interval was set to 1 second, and if this had been the cause the audit file generation frequency would have been every 1 second instead of every 5 seconds, so this wasn't the cause either. Metalink note 1171314.1 lists possible causes for excessive audit file generation and classifies them as expected behaviour, due to audit settings, or due to bugs, but it still didn't provide any help in diagnosing the cause of this audit file generation.
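For completeness, the check interval referred to above can be read from the database resource profile with crsctl; ora.mydb.db is a placeholder for the actual database resource name.
# crsctl status resource ora.mydb.db -p | grep CHECK_INTERVAL
CHECK_INTERVAL=1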
There were no applications or cron jobs executed on the OS as the sys user, so it was decided to check where the sys connections were originating from.
SQL>  select inst_id,sid,serial#,program,machine,USERNAME ,sql_id from gv$session where username='SYS' order by 1;

   INST_ID        SID    SERIAL# PROGRAM                               MACHINE      USERN   SQL_ID
---------- ---------- ---------- -----------------------------------   ---------    ------- -------------
         1        569          5 oraagent.bin@dm-db1.hps (TNS V1-V3)   dm-db1.hps   SYS
         1       2566       6317 oracle@dm-db1.hps (O001)              dm-db1.hps   SYS
         1       3683        561 oraagent.bin@dm-db1.hps (TNS V1-V3)   dm-db1.hps   SYS    4qm8a3w6a1rfd
         1       3967          1 oraagent.bin@dm-db1.hps (TNS V1-V3)   dm-db1.hps   SYS
         2        569          3 oraagent.bin@dm-db2.hps (TNS V1-V3)   dm-db2.hps   SYS
         2       3682          1 oraagent.bin@dm-db2.hps (TNS V1-V3)   dm-db2.hps   SYS

6 rows selected.
Sys sessions with the program oraagent.bin@hostname are expected in a RAC environment (except for a bug in 11.2.0.2, refer to 1307139.1) and do not cause excessive audit file generation. The only thing out of place was the O001 process. Looking at the logon time for this session, it showed the session had been logged on for a while, whereas similar systems (11.2.0.3 RAC) didn't have this process logged on for long periods as in this case. The Oracle reference documentation states "Onnn slave processes are spawned on demand. These processes communicate with the ASM instance. Maintains a connection to the ASM instance for metadata operations". According to 1556564.1, Onnn processes are spawned and terminated automatically and can be killed if required; the system will re-spawn them when needed and killing them will not affect database operations.
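To locate the corresponding Onnn OS process on a node before deciding whether to kill it, something like the following can be used (illustrative only; the process name also carries the instance SID, which differs per system).
# ps -ef | grep ora_o00 | grep -v grep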
However, it was decided instead to do a cluster stack restart on node 1 during system maintenance. After the restart the Onnn process was not there and audit file generation went back to normal; the bursts of audit file generation every 5 seconds were gone.

Useful metalink notes
How to Setup Auditing [ID 1020945.6]
11g: Possible reasons for many OS audit trail (.aud) files, <1KB in size [ID 1474823.1]
Large Number of Audit Files Generated by Oracle Restart or Grid Infrastructure [ID 1378881.1]
Large Number Of Audit Files Generated During Rman Backups [ID 1211396.1]
AUDIT_SYS_OPERATIONS Set To FALSE Yet Audit Files Are Generated [ID 308066.1]
Huge/Large/Excessive Number Of Audit Records Are Being Generated In The Database [ID 1171314.1]
SCRIPT: Generate AUDIT and NOAUDIT Statements for Current Audit Settings [ID 287436.1]
How does the NOAUDIT option work (Disabling AUDIT commands)[ID 1068714.6]
A Lot of Audit Files in ASM Home [ID 813416.1]
Many OS Audit Files Produced By The Grid Control Agent Connections [ID 1196323.1]
New Feature DBMS_AUDIT_MGMT To Manage And Purge Audit Information [ID 731908.1]
Manage Audit File Directory Growth with cron [ID 1298957.1]
Script to Show Audit Options/Audit Trail [ID 1019552.6]
AUDIT_TRAIL Set to DB yet Some Audited Entries for non-Sysdba Users Are Created in the OS Trail. [ID 1279934.1]
The Column DBA_PRIV_AUDIT_OPTS Has Rows With USER_NAME 'ANY CLIENT' and PROXY_NAME NULL [ID 455565.1]
High Load On Server from Process Ora_onnn [ID 1556564.1]