Monday, July 1, 2019

4K Sector, compatible.rdbms and Redo Log File Block Size

Disks these days comes in 512 or 4K sector sizes. Oracle DB is aware of the 4K sector size and supports it(including on ASM). The 4K sector size mainly affect the redo logs, standby redo log and archive logs. On DBs that are on Linux redo logs have a block size of 512 and could be affected negatively running on a 4K sector disk unless some additional work is done. 4K sector disks come in two types, native and emulated. In emulated mode (denoted as 512e), a 4K physical sector is presented as eight logical 512 sectors. On the native mode a one physical 4K sector is presented as a one logical 4k sector. The MOS notes mentioned at the end of the post has more information on these.
This post show how setting the compatible.rdbms could affect a the default block size of the newly created redo logs. The database and GI (for ASM) used for this was 19.3. OS is OEL 7.4 The setup was created on Azure.
For this setup, two separate disks were attached to the Azure VM to be used in two ASM groups. These two ASM groups will be used to host each member of a two member redo log group. Azure premium SSD disks are in emulation mode. This could be verified by looking at the logical and physical sector information.
# fdisk -l /dev/sde

Disk /dev/sde: 4294 MB, 4294967296 bytes, 8388608 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

# fdisk -l /dev/sdg

Disk /dev/sdg: 4294 MB, 4294967296 bytes, 8388608 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Another way to get the same information is to query the virtual file system.
# cat  /sys/block/sde/queue/physical_block_size
4096
# cat  /sys/block/sde/queue/logical_block_size
512

# cat  /sys/block/sdg/queue/physical_block_size
4096
# cat  /sys/block/sdg/queue/logical_block_size
512
A single partition was created on each disk taking in the whole disk. When using 4K sectors with ASM it's important to align the partition to avoid any additional IO due to misalignment. Follow MOS 1523947.1 for more on disk alignment. The start of the partition is aligned with the start of the sector by specifying the start to occur 1MB from the beginning of the disk.
parted /dev/sde
GNU Parted 3.1
Using /dev/sde
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mktable gpt
(parted) u
Unit?  [compact]? mib
(parted) p
Model: Msft Virtual Disk (scsi)
Disk /dev/sde: 4096MiB <====
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start  End  Size  File system  Name  Flags

(parted) mkpart redo1 1 4095 (above value - 1)

(parted) p
Model: Msft Virtual Disk (scsi)
Disk /dev/sde: 4096MiB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start    End      Size     File system  Name   Flags
 1      1.00MiB  4095MiB  4094MiB               redo1
Same partition viewed in sectors shows
(parted) u
Unit?  [MiB]? s
(parted) p
Model: Msft Virtual Disk (scsi)
Disk /dev/sde: 8388608s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start  End       Size      File system  Name   Flags
 1      2048s  8386559s  8384512s               redo1
Similarly the other disk is also partitioned
(parted) p
Model: Msft Virtual Disk (scsi)
Disk /dev/sdg: 8388608s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start  End       Size      File system  Name   Flags
 1      2048s  8386559s  8384512s               redo2
No ASMLib was used for creating the ASM disks. Instead udev rules were created for the partitions. Following that the ASM diskgroups were created with default options (no explicit sector sizes were set).
create diskgroup redo1 external redundancy disk '/dev/oracleasm/cgredo1' attribute 'au_size'='4m';
create diskgroup redo2 external redundancy disk '/dev/oracleasm/cgredo2' attribute 'au_size'='4m';
Each ASM disk group is then assigned online log destination 1 and 2.
SQL> alter system set db_create_online_log_dest_1='+redo1' scope=both;
SQL> alter system set db_create_online_log_dest_2='+redo2' scope=both;
The compatible and sector size attributes for each of the disk groups is shown below. What's interesting here is that the sector size is set to 512. The actual physical sector of 4K is not detected.
SQL>  select (select name from v$asm_diskgroup dg where dg.group_number=a.group_number) as dg_name,name,value from v$asm_attribute a where  group_number in (3,4) and (name like '%sector%' or name like '%compat%');

DG_NAME                        NAME                           VALUE
------------------------------ ------------------------------ --------------
REDO1                          sector_size                    512
REDO1                          compatible.asm                 11.2.0.2.0
REDO1                          compatible.rdbms               10.1.0.0.0
REDO1                          logical_sector_size            512
REDO2                          sector_size                    512
REDO2                          compatible.asm                 11.2.0.2.0
REDO2                          compatible.rdbms               10.1.0.0.0
REDO2                          logical_sector_size            512
At this stage when redo group is created the block size of the redo logs is 512.
GROUP# BLOCKSIZE 
---------- ----------
         1 512
Changing the compatible.asm to 19.
alter diskgroup redo1 set attribute 'compatible.asm'='19.0.0.0.0';
alter diskgroup redo2 set attribute 'compatible.asm'='19.0.0.0.0';

DG_NAME                        NAME                           VALUE
------------------------------ ------------------------------ --------------
REDO1                          sector_size                    512
REDO1                          compatible.asm                 19.0.0.0.0
REDO1                          compatible.rdbms               10.1.0.0.0
REDO1                          logical_sector_size            512
REDO2                          sector_size                    512
REDO2                          compatible.asm                 19.0.0.0.0
REDO2                          compatible.rdbms               10.1.0.0.0
REDO2                          logical_sector_size            512
New redo logs created would still have 512 as block size. There's no warning with regard to log switches of existing groups.

Next the sector size is set to 4096, which is the physical sector size. Also the logical sector size is set explicitly to 512.
alter diskgroup redo1 set attribute 'sector_size'='4096';
alter diskgroup redo2 set attribute 'sector_size'='4096';

alter diskgroup redo1 set attribute 'logical_sector_size'='512';
alter diskgroup redo2 set attribute 'logical_sector_size'='512';
 
DG_NAME                        NAME                           VALUE
------------------------------ ------------------------------ -------------
REDO1                          sector_size                    4096
REDO1                          compatible.asm                 19.0.0.0.0
REDO1                          compatible.rdbms               10.1.0.0.0
REDO1                          logical_sector_size            512
REDO2                          sector_size                    4096
REDO2                          compatible.asm                 19.0.0.0.0
REDO2                          compatible.rdbms               10.1.0.0.0
REDO2                          logical_sector_size            512
Similar to previous times, new redo logs created would still have 512 as block size. There's no warning with regard to log switches of existing groups.



Finally the compatible.rdbms is set to 19.
alter diskgroup redo1 set attribute 'compatible.rdbms'='19.0.0.0.0';
alter diskgroup redo2 set attribute 'compatible.rdbms'='19.0.0.0.0';

DG_NAME                        NAME                           VALUE
------------------------------ ------------------------------ ---------------
REDO1                          sector_size                    4096
REDO1                          compatible.asm                 19.0.0.0.0
REDO1                          compatible.rdbms               19.0.0.0.0
REDO1                          logical_sector_size            512
REDO2                          sector_size                    4096
REDO2                          compatible.asm                 19.0.0.0.0
REDO2                          compatible.rdbms               19.0.0.0.0
REDO2                          logical_sector_size            512
At this stage when a new redo log group is added the block size defaults to 4096.
SQL> alter database add logfile group 3;

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARC STATUS           FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME     CON_ID
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- ----------
         1          1        102  104857600        512          1 NO  CURRENT                 902579 30-MAY-19   1.8447E+19                    0
         2          1        101  104857600        512          1 NO  INACTIVE                902576 30-MAY-19       902579 30-MAY-19          0
         3          1          0  104857600       4096          1 YES UNUSED                       0                      0                    0
If a database was created after changing the ASM attribute (compat.rbdms to 19) then the newly created database's redo logs will ahve a block size of 4096.
Since the disks are in emulated mode, it's still possible to create redo groups with 512 block size.
alter database add logfile group 4 blocksize 512;

Database altered.
However, this will result in following warning shown on the alert log when log switching happens.
2019-05-30T14:59:59.922757+00:00
WARNING! THE SPECIFIED LOGFILE BLOCKSIZE (512) IS LESS THAN THE MEDIA SECTOR SIZE ON DISK (4096). LOGFILE WRITES ARE NOT SECTOR ALIGNED. THIS MAY LEAD TO DEGRADED PERFORMANCE DUE TO READ-MODIFY-WRITE OPERATIONS.
2019-05-30T15:00:04.977357+00:00
WARNING! THE SPECIFIED LOGFILE BLOCKSIZE (512) IS LESS THAN THE MEDIA SECTOR SIZE ON DISK (4096). LOGFILE WRITES ARE NOT SECTOR ALIGNED. THIS MAY LEAD TO DEGRADED PERFORMANCE DUE TO READ-MODIFY-WRITE OPERATIONS.
2019-05-30T15:00:10.072368+00:00
WARNING! THE LOGICAL BLOCKSIZE (512) OF LOGFILE MEMBER +REDO1/FCTEST/ONLINELOG/group_4.257.1009637999 IS LESS THAN THE PHYSICAL SECTOR SIZE (4096). LOGFILE WRITES ARE NOT SECTOR ALIGNED. THIS MAY LEAD TO DEGRADED PERFORMANCE DUE TO THE READ-MODIFY-WRITE OPERATIONS.
WARNING! THE LOGICAL BLOCKSIZE (512) OF LOGFILE MEMBER +REDO2/FCTEST/ONLINELOG/group_4.257.1009638005 IS LESS THAN THE PHYSICAL SECTOR SIZE (4096). LOGFILE WRITES ARE NOT SECTOR ALIGNED. THIS MAY LEAD TO DEGRADED PERFORMANCE DUE TO THE READ-MODIFY-WRITE OPERATIONS.


An interesting behaviour was observed during this changing of the compat.rdbms attribute. If only one of the ASM disk group has this attribute changed, then the outcome changes based on if it is the first or the second online destination. For example if it's only the first online destination's ASM diskgroup that has compat.rdbms changed as below
alter diskgroup redo1 set attribute 'compatible.rdbms'='19.0.0.0.0';

DG_NAME                        NAME                           VALUE
------------------------------ ------------------------------ --------------
REDO1                          sector_size                    4096
REDO1                          compatible.asm                 19.0.0.0.0
REDO1                          compatible.rdbms               19.0.0.0.0
REDO1                          logical_sector_size            512
REDO2                          sector_size                    4096
REDO2                          compatible.asm                 19.0.0.0.0
REDO2                          compatible.rdbms               10.1.0.0.0
REDO2                          logical_sector_size            512
then any attempt to create online redo log groups would fail as below (no issue for the existing redo logs).
SQL> alter database add logfile group 3;
alter database add logfile group 3
*
ERROR at line 1:
ORA-01378: The logical block size (4096) of file +redo2 is not compatible with the disk sector size (media sector size is 512 and host sector size is 512)
On the other hand if it's only the second online log destination that was changed (instead of both) as shown below
alter diskgroup redo2 set attribute 'compatible.rdbms'='19.0.0.0.0';

DG_NAME                        NAME                           VALUE
------------------------------ ------------------------------ ----------------
REDO1                          sector_size                    4096
REDO1                          compatible.asm                 19.0.0.0.0
REDO1                          compatible.rdbms               10.1.0.0.0
REDO1                          logical_sector_size            512
REDO2                          sector_size                    4096
REDO2                          compatible.asm                 19.0.0.0.0
REDO2                          compatible.rdbms               19.0.0.0.0
REDO2                          logical_sector_size            512
then newly created redo logs will have block size of 512. But during log switching warning is shown for the log file members who are in redo2 group.
SQL> alter database add logfile group 3;

Database altered.

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARC STATUS           FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME     CON_ID
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- ----------
         1          1        156  104857600        512          2 NO  CURRENT                 903992 30-MAY-19   1.8447E+19                    0
         2          1          0  104857600        512          2 YES UNUSED                       0                      0                    0
         3          1          0  104857600        512          2 YES UNUSED                       0                      0                    0


2019-05-30T14:56:21.282852+00:00
WARNING! THE LOGICAL BLOCKSIZE (512) OF LOGFILE MEMBER +REDO2/FCTEST/ONLINELOG/group_3.256.1009637741 IS LESS THAN THE PHYSICAL SECTOR SIZE (4096). LOGFILE WRITES ARE NOT SECTOR ALIGNED. THIS MAY LEAD TO DEGRADED PERFORMANCE DUE TO THE   READ-MODIFY-WRITE OPERATIONS.
WARNING! THE LOGICAL BLOCKSIZE (512) OF LOGFILE MEMBER +REDO2/FCTEST/ONLINELOG/group_3.256.1009637741 IS LESS THAN THE PHYSICAL SECTOR SIZE (4096). LOGFILE WRITES ARE NOT SECTOR ALIGNED. THIS MAY LEAD TO DEGRADED PERFORMANCE DUE TO THE   READ-MODIFY-WRITE OPERATIONS.


Useful metalink notes
Supporting ASM on 4K/4096 Sector Size (SECTOR_SIZE) Disks [ID 1630790.1]
Alert: After SAN Firmware Upgrade, ASM Diskgroups ( Using ASMLIB) Cannot Be Mounted Due To ORA-15085: ASM disk "" has inconsistent sector size. [ID 1500460.1]
4096 Physical Block Size Support in Oracle Linux [ID 1582530.1]
Supporting 4K Sector Disks [ID 1133713.1]
ASM Diskgroups Cannot Be Created On RedHat6 Using 4K(4096) SECTOR_SIZE Non-ASMLIB Disks [ID 1961171.1]
Using 4k Redo Logs on Flash, 4k-Disk and SSD-based Storage [ID 1681266.1]