Monday, July 23, 2018

Start of Service Fails with ORA-16000 on Physical Standby Open for Read Only

Starting a database service failed with ORA-16000 on a physical standby where both the CDB and the PDB are open in read-only mode. The data guard setup is the same one mentioned in the earlier post Data Guard on 12.2 CDB.
SQL> show con_name

CON_NAME
----------
CDB$ROOT

SQL> select open_mode from v$database;

OPEN_MODE
--------------------
READ ONLY WITH APPLY

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDBAPP1                        READ ONLY  NO

srvctl add service -db stbycdb -pdb pdbapp1 -service abc -role PHYSICAL_STANDBY -notification TRUE -failovertype NONE -failovermethod NONE -failoverdelay 0 -failoverretry 0
srvctl start service -db stbycdb -s abc
PRCD-1084 : Failed to start service abc
PRCR-1079 : Failed to start resource ora.stbycdb.abc.svc
CRS-5017: The resource action "ora.stbycdb.abc.svc start" encountered the following error:
ORA-16000: database or pluggable database open for read-only access
ORA-06512: at "SYS.DBMS_SERVICE", line 5
ORA-06512: at "SYS.DBMS_SERVICE", line 288
ORA-06512: at line 1
. For details refer to "(:CLSN00107:)" in "/opt/app/oracle/diag/crs/city7s/crs/trace/ohasd_oraagent_grid.trc".

CRS-2674: Start of 'ora.stbycdb.abc.svc' on 'city7s' failed
The reason seems to be that creating the service with the physical standby role tries to insert some rows into the read-only database. To fix the problem, first add the service to the primary with the role set to physical standby and start it there. It may seem odd to start a database service that is defined for the physical standby role on the primary, but that is what is needed to resolve this. Once the service has started, stop it on the primary before moving on to the next steps.
srvctl add service -db prodcdb -pdb pdbapp1 -service abc -role PHYSICAL_STANDBY -notification TRUE -failovertype NONE -failovermethod NONE -failoverdelay 0 -failoverretry 0
srvctl start service -db prodcdb -s abc

lsnrctl status

...
  Instance "prodcdb", status READY, has 1 handler(s) for this service...
Service "prodcdb_DGB" has 1 instance(s).
  Instance "prodcdb", status READY, has 1 handler(s) for this service...
Service "prodcdb_DGMGRL" has 1 instance(s).
  Instance "prodcdb", status UNKNOWN, has 1 handler(s) for this service...
Service "abc" has 1 instance(s).
  Instance "prodcdb", status READY, has 1 handler(s) for this service...
The command completed successfully

srvctl stop service -db prodcdb -s abc


Do a few log switches and wait until these logs are applied on the standby (a verification sketch is shown below). Once the logs are applied on the standby, create and start the service there.
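A minimal verification sketch, assuming SQL*Plus access as SYSDBA on both hosts (the exact queries may vary with the configuration):
# on the primary: force a couple of log switches
sqlplus -s / as sysdba <<'EOF'
alter system switch logfile;
alter system switch logfile;
EOF
# on the standby: confirm the logs have been applied
sqlplus -s / as sysdba <<'EOF'
select max(sequence#) last_applied from v$archived_log where applied='YES';
select name, value from v$dataguard_stats where name = 'apply lag';
EOF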
srvctl add service -db stbycdb -pdb pdbapp1 -service abc -role PHYSICAL_STANDBY -notification TRUE -failovertype NONE -failovermethod NONE -failoverdelay 0 -failoverretry 0
srvctl start service -db stbycdb -s abc

lsnrctl status

...
  Instance "stbycdb", status READY, has 1 handler(s) for this service...
Service "stbycdb_DGB" has 1 instance(s).
  Instance "stbycdb", status READY, has 1 handler(s) for this service...
Service "stbycdb_DGMGRL" has 1 instance(s).
  Instance "stbycdb", status UNKNOWN, has 1 handler(s) for this service...
Service "abc" has 1 instance(s).
  Instance "stbycdb", status READY, has 1 handler(s) for this service...
The command completed successfully
Useful metalink notes
ORA-16000 Cannot Enable Auto Open of PDB On Physical Standby [ID 2377174.1]
How to create a RAC Database Service With Physical Standby Role Option? [ID 1129143.1]

Saturday, July 14, 2018

RAC on Docker - Single Host Setup

Similar to a single-instance Oracle DB on Docker, RAC too could be deployed on Docker. However, unlike the single-instance version, the current RAC on Docker is intended for testing and development work only. It is not supported for production, whereas single-instance deployments on Docker get support at severity 2 service requests and lower.
Oracle RAC on Docker could be set up in a number of ways. This post shows the steps for setting up RAC on Docker where all the containers run on a single host, using a block device as the RAC shared storage. The other option is to use the Docker RAC storage server container. Building the RAC image requires roughly 35GB of space. The setup in this post used a single VirtualBox machine with 80GB of storage and a separate 65GB disk as the shared storage. The host server ran Oracle Linux 7.4.
cat /etc/oracle-release
Oracle Linux Server release 7.4
uname -r
4.1.12-94.3.9.el7uek.x86_64
1. Install the Docker engine on the host server. The steps are similar to installing Docker on RHEL 7; a minimal install sketch is shown below, followed by the docker info output of the host used here.
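A minimal install sketch, assuming the upstream Docker CE repository is used (the exact repository and package may differ in your environment):
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce
systemctl enable docker
systemctl start docker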
 docker info
Containers: 3
 Running: 0
 Paused: 0
 Stopped: 3
Images: 35
Server Version: 18.03.1-ce
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.1.12-94.3.9.el7uek.x86_64
Operating System: Oracle Linux Server 7.4
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 11.33GiB
Name: oel7.domain.net
ID: 67NS:D323:GVZE:SNH4:Q3SX:5ZUF:SQSW:V57Y:ENTV:LKS2:TSZL:2DPR
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
2. Download the Docker files from GitHub and unzip the master file, the same as in the previous post; a possible sketch follows.
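One way to do this, assuming the files are placed under /opt/git to match the paths used in the rest of this post:
mkdir -p /opt/git && cd /opt/git
curl -L -o docker-images-master.zip https://github.com/oracle/docker-images/archive/master.zip
unzip docker-images-master.zip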
3. If the network used for RAC is not reachable from outside the host, then Connection Manager (CMAN) must be used to access the RAC database. The network segment used for RAC in this post is not available outside the host, therefore as the first step create a CMAN image. Assuming the Docker files were unzipped to /opt/git, the CMAN Docker files are available in
/opt/git/docker-images-master/OracleDatabase/RAC/OracleConnectionManager
Copy the Oracle 12cR2 client installer zip file to the dockerfiles directory.
cp linuxx64_12201_client.zip /opt/git/docker-images-master/OracleDatabase/RAC/OracleConnectionManager/dockerfiles/12.2.0.1
Build the CMAN image by running the image build script, at the end of which the CMAN image will have been created.
cd /opt/git/docker-images-master/OracleDatabase/RAC/OracleConnectionManager/dockerfiles
./buildDockerImage.sh -v 12.2.0.1

docker images
REPOSITORY           TAG                 IMAGE ID            CREATED             SIZE
oracle/client-cman   12.2.0.1            5bd15d3f5ba2        2 minutes ago       4.58GB
oraclelinux          7-slim              c94cc930790a        2 months ago        117MB
Next, create the CMAN container. Before creating the container, create the Docker bridge network. As CMAN is used to access the RAC DB, this network must be the public network used by RAC. For this post 192.168.2.* was chosen as the public network segment and the network is named rac_pub_nw.
docker network create --driver=bridge --subnet=192.168.2.0/24 rac_pub_nw
When creating the CMAN container, the SCAN IP used for the RAC setup must be specified. Again, the IPs used here must belong to the public network.
docker run -d --hostname rac-cman --dns-search=domain.net \
  --network=rac_pub_nw --ip=192.168.2.94 \
  -e DOMAIN=domain.net -e PUBLIC_IP=192.168.2.94 \
  -e PUBLIC_HOSTNAME=rac-cman -e SCAN_NAME=rac-scan \
  -e SCAN_IP=192.168.2.135 --privileged=false \
  -p 1521:1521 --name rac-cman oracle/client-cman:12.2.0.1
This creates and starts the CMAN container. Tailing the container log will indicate when CMAN is ready to use.
docker container ls -a
CONTAINER ID        IMAGE                         ...    PORTS                              NAMES
1b5b779a96e6        oracle/client-cman:12.2.0.1   ...    0.0.0.0:1521->1521/tcp, 5500/tcp   rac-cman

docker logs -f rac-cman

06-27-2018 14:47:19 UTC :  : cman started sucessfully
06-27-2018 14:47:19 UTC :  : ################################################
06-27-2018 14:47:19 UTC :  :  CONNECTION MANAGER IS READY TO USE!
06-27-2018 14:47:19 UTC :  : ################################################
06-27-2018 14:47:19 UTC :  : cman started sucessfully
Any connection arriving on host port 1521 is forwarded to CMAN port 1521. This can also be seen in the DOCKER iptables chain.
iptables -L DOCKER -n
Chain DOCKER (3 references)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            192.168.2.94         tcp dpt:1521
Before proceeding to the next step, make sure that port 1521 is accessible from outside the host (using telnet etc.). In this case the host IP is 192.168.0.93.
telnet 192.168.0.93 1521
Trying 192.168.0.93...
Connected to 192.168.0.93.
Escape character is '^]'.
4. Docker containers get certain kernel parameter values from the host. Set the following kernel parameters at the host level (one way to persist them is sketched after the list).
fs.file-max = 6815744
net.core.rmem_max = 4194304
net.core.rmem_default = 262144
net.core.wmem_max = 1048576
net.core.wmem_default = 262144
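One way to persist these across reboots (a sketch; the drop-in file name is arbitrary):
# add the fs.file-max and net.core.* lines shown above to a sysctl drop-in file
vi /etc/sysctl.d/98-oracle-rac.conf
sysctl --system
# verify the values took effect
sysctl -a | grep -E 'file-max|net.core.[rw]mem'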
5. To build the RAC image, copy both the database and grid installer files and the RAC on Docker patch 27383741 to the OracleRealApplicationClusters dockerfiles location.
cp linuxx64_12201_database.zip ../OracleDatabase/RAC/OracleRealApplicationClusters/dockerfiles/12.2.0.1
cp linuxx64_12201_grid_home.zip ../OracleDatabase/RAC/OracleRealApplicationClusters/dockerfiles/12.2.0.1
cp p27383741_122010_Linux-x86-64.zip ../OracleDatabase/RAC/OracleRealApplicationClusters/dockerfiles/12.2.0.1
Run the image build script to create the RAC image. The build took close to 30 minutes to complete.
cd /opt/git/docker-images-master/OracleDatabase/RAC/OracleRealApplicationClusters/dockerfiles

./buildDockerImage.sh -v 12.2.0.1
...
Oracle Database Docker Image for Real Application Clusters (RAC) version 12.2.0.1 is ready to be extended:

    --> oracle/database-rac:12.2.0.1

  Build completed in 1632 seconds.

docker images
REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
oracle/database-rac   12.2.0.1            4cf55a528f6f        11 seconds ago      26.3GB
oracle/client-cman    12.2.0.1            e95d7b894f03        31 minutes ago      4.58GB
oraclelinux           7-slim              c94cc930790a        2 months ago        117MB
6. After the RAC image is created, the first RAC node (container) could be created. This step gives a few pointers to look out for when creating the RAC containers.
Make sure the CMAN container is running when creating the RAC container. Certain pre-CRS installation checks run by cluvfy must pass for the container to be created (when docker start is run). For example, if the swap space is not of the expected size, the RAC creation will fail and the logs will output the following.
06-27-2018 12:07:38 UTC :  : Performing Cluvfy Checks
06-27-2018 12:09:12 UTC :  : Cluster Verfication Check failed! Removing failure statement related to /etc/resov.conf, DNS and ntp.conf checks as you may not have DNS or NTP Server
06-27-2018 12:09:12 UTC :  : Checking Again /tmp/cluvfy_check.txt
06-27-2018 12:09:12 UTC : : Pre Checks failed for Grid installation, please check /tmp/cluvfy_check.txt
06-27-2018 12:09:12 UTC : : Error has occurred in Grid Setup, Please verify!
Logging in to the container and checking /tmp/cluvfy_check.txt will show what has failed (one way to do this is sketched below). In this example it was the swap size, shown after the sketch.
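A possible way to inspect the check file, assuming the container is named rac1:
docker exec rac1 cat /tmp/cluvfy_check.txt
# or list only the failed checks
docker exec rac1 grep -A 2 FAILED /tmp/cluvfy_check.txt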
Verifying Swap Size ...FAILED
rac1: PRVF-7573 : Sufficient swap size is not available on node "rac1"
      [Required = 11.326GB (1.1876152E7KB) ; Found = 3GB (3145724.0KB)]
This could be overcome by creating a correctly sized swap partition on the host or by adding a swap file for the duration of the RAC node creation, as sketched below.
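For example, a temporary swap file could be added on the host along these lines (a sketch; size it to what cluvfy reports as required):
dd if=/dev/zero of=/swapfile bs=1M count=12288   # ~12GB swap file
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
free -g   # confirm the new swap size
# once the node creation is complete the swap file can be removed
# swapoff /swapfile && rm -f /swapfile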
Secondly, the start of the RAC container could fail with the following:
Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused
"process_linux.go:279: applying cgroup configuration for process caused \"failed to write 95000 to 
cpu.rt_runtime_us: write /sys/fs/cgroup/cpu,cpuacct/docker/913a8506ec3f36bd5233dec8dc0596013c15021dbf5d64dd9558edd71e9aca28/cpu.rt_runtime_us: 
invalid argument\"": unknown Error: failed to start containers: rac1
To fix this, change the value on the host, setting the real-time runtime to 95000 per container.
echo 95000 > /sys/fs/cgroup/cpu,cpuacct/docker/cpu.rt_runtime_us
If the value is set for both nodes at the same time (as this is a single-host deployment), then set it to 190000.
echo 190000 > /sys/fs/cgroup/cpu,cpuacct/docker/cpu.rt_runtime_us
The RAC installation will ignore certain pre-req failures and continue to the grid install.
06-27-2018 12:16:10 UTC :  : Cluster Verfication Check failed! Removing failure statement related to /etc/resov.conf, DNS and ntp.conf checks as you may not have DNS or NTP Server
06-27-2018 12:16:10 UTC :  : Checking Again /tmp/cluvfy_check.txt
06-27-2018 12:16:10 UTC :  : Pre Checks failed for Grid installation, ignoring failure related to SCAN and /etc/resolv.conf
06-27-2018 12:16:10 UTC :  : Running Grid Installation
Other than the ones that are automatically ignored, any other pre-reqs that fail must be corrected. It was also noted that if the node creation fails for some reason, the container should be dropped and recreated (see the sketch below); re-running the container creation seems not to work even after the failed pre-reqs are corrected.
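In that situation the failed node container could be removed and the creation steps repeated, for example:
docker rm -f rac1   # remove the failed node container
# then repeat the docker create, docker network connect and docker start steps for the node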



7. Now the steps for creating and starting the first RAC node. Create the Docker bridge network for the private interconnect. In this case the private network range is not available outside the host.
docker network create --driver=bridge --subnet=192.168.1.0/24 rac_pvt_nw
The network created during CMAN creation will also be used as the public network for the RAC nodes, so there is no need to create another public network.
All RAC nodes share host information via a common hosts file. Create a directory path and an empty file that the RAC nodes will map to /etc/hosts.
mkdir -p /opt/app/racdata
chmod 777 /opt/app/racdata
cd /opt/app/racdata/
touch host_file
chmod 666 host_file
Set read/write permission for others on the block device used as the shared disk.
chmod o+rw /dev/sdb1
ls -l /dev/sdb*
brw-rw----. 1 root disk 8, 16 Jun 21 15:54 /dev/sdb
brw-rw-rw-. 1 root disk 8, 17 Jun 21 15:54 /dev/sdb1
Create the container for the first node. No RAC-related installation is carried out until this container is started. The VIP and public IP are chosen from the public network range (192.168.2.*) created during the CMAN container creation. The SCAN name and IP, as well as the CMAN name and IP, are the same as those used during the CMAN container creation. The CDB is named oracdb and the PDB is named orapdb.
docker create -t -i --hostname rac1 \
--volume /boot:/boot:ro \
--volume /dev/shm --tmpfs /dev/shm:rw,exec,size=4G \
--volume /opt/app/racdata/host_file:/etc/hosts \
--dns-search=domain.net \
--device=/dev/sdb1:/dev/asm_disk1 --privileged=false \
--cap-add=SYS_ADMIN --cap-add=SYS_NICE \
--cap-add=SYS_RESOURCE --cap-add=NET_ADMIN \
-e NODE_VIP=192.168.2.97  -e VIP_HOSTNAME=rac1-vip  \
-e PRIV_IP=192.168.1.85  -e PRIV_HOSTNAME=rac1-pvt \
-e PUBLIC_IP=192.168.2.85 -e PUBLIC_HOSTNAME=rac1 \
-e SCAN_NAME=rac-scan -e SCAN_IP=192.168.2.135  \
-e OP_TYPE=INSTALL -e DOMAIN=domain.net \
-e ASM_DEVICE_LIST=/dev/asm_disk1 \
-e ORACLE_SID=oracdb -e ORACLE_PDB=orapdb \
-e ORACLE_PWD="orarac12c" -e ASM_DISCOVERY_DIR=/dev \
-e CMAN_HOSTNAME=rac-cman -e OS_PASSWORD=orarac12c \
-e CMAN_IP=192.168.2.94 \
--restart=always --tmpfs=/run -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
--cpu-rt-runtime=95000 --ulimit rtprio=99  \
--name rac1 oracle/database-rac:12.2.0.1
Next, assign the networks to the RAC node. The IPs specified here are the same IPs used above when creating the node.
docker network disconnect  bridge rac1
docker network connect rac_pub_nw --ip 192.168.2.85 rac1
docker network connect rac_pvt_nw --ip 192.168.1.85  rac1
Finally start the container.
docker start rac1
If all the pre-reqs are fine then this will create the first RAC node.
docker logs -f rac1

07-05-2018 16:10:53 UTC :  : #################################################################
07-05-2018 16:10:53 UTC :  :  Oracle Database oracdb is up and running on rac1
07-05-2018 16:10:53 UTC :  : #################################################################
07-05-2018 16:10:53 UTC :  : Running User Script
07-05-2018 16:10:54 UTC :  : Setting Remote Listener
07-05-2018 16:10:54 UTC :  : 192.168.2.94
07-05-2018 16:10:54 UTC :  : Executing script to set the remote listener
07-05-2018 16:10:58 UTC :  : ####################################
07-05-2018 16:10:58 UTC :  : ORACLE RAC DATABASE IS READY TO USE!
07-05-2018 16:10:58 UTC :  : ####################################
8. Next, add the second node. In this case too, certain pre-req failures will be ignored and the node addition will continue.
07-09-2018 09:12:18 UTC :  : Running Cluster verification utility for new node rac2 on rac1
07-09-2018 09:13:44 UTC :  : Cluster Verfication Check failed! Removing failure statement related to /etc/resov.conf, DNS and ntp.conf checks as DNS may  not be setup and CTSSD process will take care of time synchronization
07-09-2018 09:13:44 UTC :  : Checking Again /tmp/cluvfy_check.txt
07-09-2018 09:13:44 UTC :  : Pre Checks failed for Grid installation, ignoring failure related to SCAN and /etc/resolv.conf
07-09-2018 09:13:44 UTC :  : Running Node Addition and cluvfy test for node rac2
Assign the private IP, VIP and public IP as before from the respective network segments. The only difference in this case is the operation type parameter, which specifies that this is a node addition.
docker create -t -i --hostname rac2 \
--volume /dev/shm --tmpfs /dev/shm:rw,exec,size=4G  \
--volume /boot:/boot:ro \
--volume /opt/app/racdata/host_file:/etc/hosts \
--dns-search=domain.net \
--device=/dev/sdb1:/dev/asm_disk1 --privileged=false \
--cap-add=SYS_ADMIN --cap-add=SYS_NICE \
--cap-add=SYS_RESOURCE --cap-add=NET_ADMIN \
-e EXISTING_CLS_NODES=rac1 -e OS_PASSWORD=orarac12c \
-e NODE_VIP=192.168.2.98  -e VIP_HOSTNAME=rac2-vip  \
-e PRIV_IP=192.168.1.86  -e PRIV_HOSTNAME=rac2-pvt \
-e PUBLIC_IP=192.168.2.86 -e PUBLIC_HOSTNAME=rac2 \
-e DOMAIN=domain.net -e SCAN_NAME=rac-scan \
-e SCAN_IP=192.168.2.135 -e ASM_DISCOVERY_DIR=/dev \
-e ASM_DEVICE_LIST=/dev/asm_disk1,/dev/asm_disk2 \
-e ORACLE_SID=oracdb -e OP_TYPE=ADDNODE \
--tmpfs=/run -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
--cpu-rt-runtime=95000 --ulimit rtprio=99  --restart=always \
--name rac2 oracle/database-rac:12.2.0.1
Assign networks to the RAC node
docker network disconnect  bridge rac2
docker network connect rac_pub_nw --ip 192.168.2.86 rac2
docker network connect rac_pvt_nw --ip 192.168.1.86  rac2
Finally start the container to carry out the node addition.
docker start rac2
Once the node addition is complete, the logs will indicate that it is ready for use.
07-09-2018 09:25:59 UTC :  : #################################################################
07-09-2018 09:25:59 UTC :  :  Oracle Database oracdb is up and running on rac2
07-09-2018 09:25:59 UTC :  : #################################################################
07-09-2018 09:25:59 UTC :  : Running User Script
07-09-2018 09:25:59 UTC :  : Setting Remote Listener
07-09-2018 09:25:59 UTC :  : ####################################
07-09-2018 09:25:59 UTC :  : ORACLE RAC DATABASE IS READY TO USE!
07-09-2018 09:25:59 UTC :  : ####################################
9. To connect to the RAC DB, use the host IP, not the CMAN or SCAN IP. As mentioned earlier, incoming connections to port 1521 on the host IP are forwarded to CMAN. In this case the host IP was 192.168.0.93. To connect to the PDB using SQL*Plus, the following could be used.
sqlplus  sys/orarac12c@192.168.0.93:1521/orapdb as sysdba

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         3 ORAPDB                         READ WRITE NO
10. This concludes the creation of RAC on docker using a single host.
 docker container ls
CONTAINER ID        IMAGE                         ...     PORTS                              NAMES
f488759bd0d6        oracle/database-rac:12.2.0.1  ...                                        rac2
1fc45207d827        oracle/database-rac:12.2.0.1  ...                                        rac1
dd14575f282c        oracle/client-cman:12.2.0.1   ...     0.0.0.0:1521->1521/tcp, 5500/tcp   rac-cman
The setup has taken roughly 37GB of space.
 df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        80G   11G   69G  13% /  <-- before RAC setup
/dev/sda3        80G   48G   33G  60% /  <-- after RAC setup
During subsequent restarts of the RAC nodes, the startup will detect that grid is already configured and the existing cluster will be started.
07-09-2018 10:33:24 UTC :  : ###################################################
07-09-2018 10:33:24 UTC :  : Checking if grid is already configured
07-09-2018 10:33:24 UTC :  : Grid is installed on rac1. runOracle.sh will start the Grid service
07-09-2018 10:33:24 UTC :  : Setting up Grid Env for Grid Start
07-09-2018 10:33:24 UTC :  : ##########################################################################################
07-09-2018 10:33:24 UTC :  : Grid is already installed on this container! Grid will be started by default ohasd scripts
07-09-2018 10:33:24 UTC :  : ############################################################################################
Related Posts
Installing Docker CE on RHEL 7
Create, Plug/Unplug, Patch, Export/Import and Backup Oracle DB in Docker

Sunday, July 1, 2018

JDBC Client Failover in Data Guard Configuration with PDBs

This post gives the highlights of setting up JDBC client failover in a data guard configuration with PDBs. For a comprehensive set of steps, refer to the respective white papers for 12c and for 11g.
The post shows how JDBC could be set up in an application that connects to a single-instance database in an Oracle Restart configuration, such that the JDBC connection fails over to the standby when a switchover or failover happens. The data guard setup used in this case is the same one mentioned in the earlier post Data Guard on 12.2 CDB with Oracle Restart.
1. By default, ONS is disabled and offline in Oracle Restart. In order to send FAN events, ONS must be enabled and started in Oracle Restart. This should be done on both the primary and standby nodes.
srvctl enable ons
srvctl start ons
Once done, check the ONS status on both primary and standby.
crsctl stat res ora.ons
NAME=ora.ons
TYPE=ora.ons.type
TARGET=ONLINE
STATE=ONLINE on city7

crsctl stat res ora.ons
NAME=ora.ons
TYPE=ora.ons.type
TARGET=ONLINE
STATE=ONLINE on city7s
When ONS is enabled, stopping HAS throws up the following error:
crsctl stop has
...
CRS-2673: Attempting to stop 'ora.ons' on 'city7s'
CRS-5014: Agent "ORAAGENT" timed out starting process "/opt/app/oracle/product/12.2.0/grid/opmn/bin/onsctli" for action "stop": details at "(:CLSN00009:)" in "/opt/app/oracle/diag/crs/city7s/crs/trace/ohasd_oraagent_grid.trc"
CRS-2675: Stop of 'ora.ons' on 'city7s' failed
CRS-2679: Attempting to clean 'ora.ons' on 'city7s'
CRS-2681: Clean of 'ora.ons' on 'city7s' succeeded
This only happens when stopping HAS; there is no such issue during HAS startup, and the ONS service gets started along with the other services. This appears to be a known issue in other versions relating to RAC, but nothing could be found on MOS with regard to 12.2 Oracle Restart. An SR was raised and this is being investigated under bug 28134413. In spite of this issue the failover works as expected.
Update 2020-01-28: As a result of the SR raised, Oracle has created MOS doc 2631403.1, which now states this is expected behavior on SIHA.

2. Create a service and associate it with a PDB for the application to connect to. It is important that the application connects to the database using this service for failover to work in the event of a role transition. When a service is created for a PDB, the PDB could be brought up by starting the service. However, stopping the service does not bring down the PDB; only the service is stopped (a sketch demonstrating this follows the primary service configuration below). The following service was created on the primary PDB.
srvctl add service -db prodcdb -pdb pdbapp1 -service devsrv -role PRIMARY -notification TRUE -failovertype NONE -failovermethod NONE -failoverdelay 0 -failoverretry 0 

srvctl config service -d prodcdb -s devsrv

Service name: devsrv
Cardinality: SINGLETON
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: true
Global: false
Commit Outcome: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Failover restore: NONE
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Pluggable database name: pdbapp1
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
Drain timeout:
Stop option:
Session State Consistency: DYNAMIC
GSM Flags: 0
Service is enabled
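As an aside, the service/PDB behaviour mentioned in step 2 could be verified with something like the following (a sketch, run on the primary host):
srvctl start service -db prodcdb -s devsrv
sqlplus -s / as sysdba <<'EOF'
select name, open_mode from v$pdbs where name = 'PDBAPP1';
EOF
# stopping the service leaves the PDB open
srvctl stop service -db prodcdb -s devsrv
sqlplus -s / as sysdba <<'EOF'
select name, open_mode from v$pdbs where name = 'PDBAPP1';
EOF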
The following service was created on the standby PDB, to become active when the standby becomes primary. The service name must be the same as the service created on the primary.
srvctl add service -db stbycdb -pdb pdbapp1 -service devsrv -role PRIMARY -notification TRUE -failovertype NONE -failovermethod NONE -failoverdelay 0 -failoverretry 0

srvctl config service -d stbycdb -s devsrv
Service name: devsrv
Cardinality: SINGLETON
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: true
Global: false
Commit Outcome: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Failover restore: NONE
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Pluggable database name: pdbapp1
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
Drain timeout:
Stop option:
Session State Consistency: DYNAMIC
GSM Flags: 0
Service is enabled
Once the service is created, make sure the patch for bug 26439462 (Doc ID 26439462.8) is applied. This bug prevents the PDB service from being brought up automatically after a role transition. Applying the latest RU for 12.2 (at the time of testing it was 12.2.0.1.180417) resolved this issue. If the PDB service does not start automatically, then the JDBC failover will fail. This could be tested by carrying out a switchover and checking whether the PDB service comes up automatically after the role transition, as sketched below.
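A test sketch, assuming a Data Guard broker configuration in which the databases are named prodcdb and stbycdb as in this setup:
# switch over using the broker (prompts for the SYS password)
dgmgrl sys@prodcdb "switchover to stbycdb"
# after the switchover completes, the PDB service should come up on the new primary by itself
srvctl status service -db stbycdb -s devsrv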

3. Configure the JDBC client to use UCP and enable FCF by setting the ONS configuration settings. Details of this could be found in the 12cR1 white paper and the high availability best practices guide.



4. Create a TNS entry containing both the primary and standby hosts and the service name created earlier, and use this TNS entry to connect to the database. To avoid ORA-12514, set RETRY_COUNT x RETRY_DELAY so that it is slightly higher than the total time needed for the switchover plus starting the service (in the entry below, 40 x 2 = 80 seconds).
DGTNS =
  (DESCRIPTION =
    (FAILOVER = on)(CONNECT_TIMEOUT=60)(RETRY_COUNT=40)(RETRY_DELAY=2)(TRANSPORT_CONNECT_TIMEOUT=1)
    (ADDRESS_LIST =
      (LOAD_BALANCE = yes)
      (ADDRESS = (PROTOCOL = TCP)(HOST = city7.domain.net)(PORT = 1581))
      (ADDRESS = (PROTOCOL = TCP)(HOST = city7s.domain.net)(PORT = 1581))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = devsrv)
    )
  )
5. Start the application and verify that the application server IP is listed in the ONS subscription list on each database server; one way to check this is sketched below.
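A possible check via onsctl from the Grid home (the output format varies between versions; look for the application server IP among the client connections):
export ORACLE_HOME=/opt/app/oracle/product/12.2.0/grid
$ORACLE_HOME/opmn/bin/onsctl debug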

6. Do a switchover and check the application connectivity. For testing purposes a Java application was created to output the connected database. The output below shows that initially it was connected to the PDB in the stbycdb CDB, which was primary at that time. During the switchover period, while the standby DB is made primary and the PDB and its associated service are started, connections could error out. Once the service is up, the JDBC connections succeed.
Connected to stbycdb DB Server hpc1.domain.net Application Server on Mon May 21 09:55:50 BST 2018
Connected to stbycdb DB Server hpc1.domain.net Application Server on Mon May 21 09:55:51 BST 2018
Connected to stbycdb DB Server hpc1.domain.net Application Server on Mon May 21 09:55:52 BST 2018
Connected to prodcdb DB Server hpc1.domain.net Application Server on Mon May 21 09:56:10 BST 2018
Connected to prodcdb DB Server hpc1.domain.net Application Server on Mon May 21 09:56:11 BST 2018
Connected to prodcdb DB Server hpc1.domain.net Application Server on Mon May 21 09:56:13 BST 2018