Monday, August 19, 2013

Oracle RAC with OCR mirrored by ASM pitfalls

Since 11gR2, the OCR and voting files can be placed in ASM. Here is an overview:
ASM Redundancy Level   OCR Mirrors   Voting disks   Failgroups   Min. count of disks
EXTERNAL               1             1              1            1
NORMAL                 2             3              3            3
HIGH                   3             5              5            5
The most common RAC configuration is a 2-node RAC, for example the Oracle Database Appliance.
If the RAC has a single storage system there is no problem: ASM mirroring is not necessary and EXTERNAL redundancy can be used. If the RAC has two storage systems, the voting disks become a problem, because a third location is needed for them. In most configurations two locations are expensive enough and a third location is not available.
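The voting disk problem is plain majority arithmetic: a node has to reach more than half of the configured voting files to stay in the cluster. Here is a minimal Python sketch of that rule. The function name and the site layouts are only illustrative assumptions, they are not part of any Oracle tool.

```python
def survives_site_loss(voting_files_per_site, lost_site):
    """Return True if the surviving sites still hold a strict majority of voting files."""
    total = sum(voting_files_per_site.values())
    remaining = total - voting_files_per_site[lost_site]
    return remaining > total / 2


# Hypothetical layouts: two storage sites (2:1) versus three sites (1:1:1).
two_sites = {"A": 2, "B": 1}
three_sites = {"A": 1, "B": 1, "C": 1}

for layout in (two_sites, three_sites):
    for site in layout:
        print(layout, "lose", site, "->", survives_site_loss(layout, site))
```

With the 2:1 layout, losing Site A leaves only 1 of 3 voting files, which is not a majority. With one voting file on each of three sites, any single site can be lost.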
Here is a real situation:
The customer has two nodes and two storage systems. All files are mirrored by ASM with NORMAL redundancy, including the OCR diskgroup:
Site A: 2x voting, secondary RAC node
Site B: 1x voting, master RAC node
Suddenly Site A goes down due to a site disaster. A few seconds later the RAC node on Site B shuts down due to OCR errors.
Why does this happen?
ASM mirroring is done at block/extent level.
  • EXTERNAL = no mirroring by ASM
  • NORMAL = each extent has one mirror copy, located in one other failgroup
  • HIGH = each extent has two mirror copies, located in two other failgroups
Given the number of disks needed and the number of mirror copies, which redundancy level should be used?
  1. EXTERNAL = not usable with two storage systems
  2. NORMAL = 3 disks with 3 voting disks and one OCR mirror
    The disks are therefore split 2:1 between the sites, but each OCR extent has only 2 copies, so both copies of an extent may be on Site A.
  3. HIGH = 5 disks with 5 voting disks and a HIGH redundancy OCR mirror
    The disks are therefore split 3:2 between the sites, but each OCR extent has only 3 copies, so all three copies of an extent may be on Site A.
So what can be done? The solution: a NORMAL redundancy diskgroup in which the OCR file itself is mirrored with HIGH redundancy. This gives the following setup:
The disks are split 2:1 between the sites and each OCR extent has 3 copies. With only 3 disks, every OCR extent is therefore present on every disk. Whichever site the disaster hits, at least one copy of the OCR remains available.
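The reasoning can be checked with a small Python sketch. It assumes that every disk is its own failgroup and that ASM puts the copies of one extent on distinct disks. The disk names, the site mapping and the function name are hypothetical and only serve the illustration.

```python
from itertools import combinations


def can_lose_all_copies(disk_site, copies, lost_site):
    """True if some placement of `copies` copies on distinct disks lies entirely on lost_site."""
    return any(
        all(disk_site[d] == lost_site for d in placement)
        for placement in combinations(disk_site, copies)
    )


# Assumed layouts: 3 disks split 2:1 and 5 disks split 3:2 between Site A and Site B.
three_disks = {"DISK_1": "A", "DISK_2": "A", "DISK_3": "B"}
five_disks = {"DISK_1": "A", "DISK_2": "A", "DISK_3": "A", "DISK_4": "B", "DISK_5": "B"}

print(can_lose_all_copies(three_disks, 2, "A"))  # NORMAL file in a 3-disk group
print(can_lose_all_copies(five_disks, 3, "A"))   # HIGH file in a 5-disk group
print(can_lose_all_copies(three_disks, 3, "A"))  # HIGH file in a 3-disk group
```

The first two cases come out as True: an extent can have all its copies on Site A. Only the third case, 3 copies on 3 disks, is False, because one copy always ends up on the disk at Site B.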
Here is the demonstration on 12.1.0.1 GI:
1. Create the cluster with a NORMAL redundancy cluster diskgroup DG_CLUSTER
2. Check the ASM template of the OCR diskgroup:
SQL> select * from v$asm_template where group_number=1;

GROUP_NUMBER ENTRY_NUMBER REDUND STRIPE S NAME PRIM MIRR CON_ID
------------ ------------ ------ ------ - ------------------------------ ---- ---- ----------
1 123 MIRROR COARSE Y VOTINGFILE COLD COLD 0
1 343 MIRROR COARSE Y OCRFILE COLD COLD 0
3. Check the mirroring of the OCR file:
ASMCMD> ls -l +DG_CLUSTER/vmsvr-clu2/OCRFILE
Type Redund Striped Time Sys Name
OCRFILE MIRROR COARSE JUL 23 23:00:00 Y REGISTRY.255.821572803
4. Check ASM extent distribution
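SQL> -- x$kffxp maps every file extent to the disk that holds it
SQL> -- lxn_kffxp: 0 = primary copy, 1 = first mirror, 2 = second mirror
SQL> select g.name
       ,d.path
       ,e.xnum_kffxp extent
       ,decode(e.lxn_kffxp,0,'primary',1,'mirror-normal','mirror-high') mirrormeta
     from x$kffxp e
       ,v$asm_alias a
       ,v$asm_disk d
       ,v$asm_diskgroup g
     where e.number_kffxp=a.file_number
     -- join on the group number as well, so other diskgroups are not mixed in
     and e.group_kffxp=a.group_number
     and e.group_kffxp=d.group_number
     and e.disk_kffxp=d.disk_number
     and d.group_number = g.group_number
     and a.name='REGISTRY.255.821572803'
     order by 3,4 desc;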

NAME PATH EXTENT MIRRORMETA
------------------------------ ------------------------------ ---------- -------------
DG_CLUSTER ORCL:ORA_DISK_2 0 primary
DG_CLUSTER ORCL:ORA_DISK_1 0 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_3 1 primary
DG_CLUSTER ORCL:ORA_DISK_1 1 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_1 2 primary
DG_CLUSTER ORCL:ORA_DISK_3 2 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_2 3 primary
DG_CLUSTER ORCL:ORA_DISK_3 3 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_3 4 primary
DG_CLUSTER ORCL:ORA_DISK_2 4 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_1 5 primary
DG_CLUSTER ORCL:ORA_DISK_2 5 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_2 6 primary
DG_CLUSTER ORCL:ORA_DISK_1 6 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_3 7 primary
DG_CLUSTER ORCL:ORA_DISK_1 7 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_1 8 primary
DG_CLUSTER ORCL:ORA_DISK_3 8 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_2 9 primary
DG_CLUSTER ORCL:ORA_DISK_3 9 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_3 10 primary
DG_CLUSTER ORCL:ORA_DISK_2 10 mirror-normal
...
As you can see, the diskgroup is made up of 3 disks (ORA_DISK_1 to ORA_DISK_3). Furthermore, there are only two copies of each extent: the primary and one mirror.
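To see what this listing means for a site failure, the extent map can be fed into a few lines of Python. The disk numbers per extent are copied from the output above. Which disks sit on which site is not shown in the listing, so the mapping below (ORA_DISK_1 and ORA_DISK_2 on Site A) is an assumption for illustration.

```python
# Disk numbers holding each extent (primary, mirror), taken from the listing above.
EXTENT_COPIES = {
    0: (2, 1), 1: (3, 1), 2: (1, 3), 3: (2, 3), 4: (3, 2), 5: (1, 2),
    6: (2, 1), 7: (3, 1), 8: (1, 3), 9: (2, 3), 10: (3, 2),
}

# Assumption: ORA_DISK_1 and ORA_DISK_2 are on Site A, ORA_DISK_3 is on Site B.
SITE_A_DISKS = {1, 2}


def lost_extents(extent_copies, failed_disks):
    """Return the extents whose copies are all on failed disks."""
    return sorted(x for x, disks in extent_copies.items() if set(disks) <= failed_disks)


print(lost_extents(EXTENT_COPIES, SITE_A_DISKS))  # → [0, 5, 6]
```

Under this assumed mapping, extents 0, 5 and 6 of the OCR have no surviving copy once Site A is gone, which is enough to make the OCR unusable.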
5. Backup OCR
[root ~]# ocrconfig -manualbackup
vmsvredu3 2013/07/24 23:51:17 /opt/oracle/12.1/grid/cdata/vmsvr-clu2/backup_20130724_235117.ocr
vmsvredu3 2013/07/23 22:52:10 /opt/oracle/12.1/grid/cdata/vmsvr-clu2/backup_20130723_225210.ocr
vmsvredu3 2013/07/23 22:45:11 /opt/oracle/12.1/grid/cdata/vmsvr-clu2/backup_20130723_224511.ocr
6. To correct this problem if you don’t have an OCR mirror, stop the cluster and start one node in exclusive mode without crsd:
crsctl start crs -excl -nocrs
7. Change the ASM template:
SQL> alter diskgroup dg_cluster modify template OCRFILE attributes (HIGH);

Diskgroup altered.

SQL> select * from v$asm_template where group_number=1;

GROUP_NUMBER ENTRY_NUMBER REDUND STRIPE S NAME PRIM MIRR CON_ID
------------ ------------ ------ ------ - ------------------------------ ---- ---- ----------
1 120 MIRROR COARSE Y PARAMETERFILE COLD COLD 0
1 121 MIRROR COARSE Y ASMPARAMETERFILE COLD COLD 0
1 123 MIRROR COARSE Y VOTINGFILE COLD COLD 0
1 124 MIRROR COARSE Y DUMPSET COLD COLD 0
1 125 HIGH FINE Y CONTROLFILE COLD COLD 0
1 126 MIRROR COARSE Y FLASHFILE COLD COLD 0
1 127 MIRROR COARSE Y ARCHIVELOG COLD COLD 0
1 128 MIRROR COARSE Y ONLINELOG COLD COLD 0
1 129 MIRROR COARSE Y DATAFILE COLD COLD 0
1 230 MIRROR COARSE Y TEMPFILE COLD COLD 0
1 231 MIRROR COARSE Y BACKUPSET COLD COLD 0
1 232 MIRROR COARSE Y XTRANSPORT BACKUPSET COLD COLD 0
1 233 MIRROR COARSE Y INCR XTRANSPORT BACKUPSET COLD COLD 0
1 234 MIRROR COARSE Y AUTOBACKUP COLD COLD 0
1 235 MIRROR COARSE Y XTRANSPORT COLD COLD 0
1 237 MIRROR COARSE Y CHANGETRACKING COLD COLD 0
1 238 MIRROR COARSE Y FLASHBACK COLD COLD 0
1 239 MIRROR COARSE Y KEY_STORE COLD COLD 0
1 340 MIRROR COARSE Y AUTOLOGIN_KEY_STORE COLD COLD 0
1 341 MIRROR COARSE Y AUDIT_SPILLFILES COLD COLD 0
1 342 MIRROR COARSE Y DATAGUARDCONFIG COLD COLD 0
1 343 HIGH COARSE Y OCRFILE COLD COLD 0

22 rows selected.

SQL>
8. Remove old OCR
ASMCMD> ls -l
Type Redund Striped Time Sys Name
OCRFILE MIRROR COARSE JUL 24 10:00:00 Y REGISTRY.255.821572803
ASMCMD> rm -f REGISTRY.255.821572803
9. Restore OCR
[root ~]# ocrconfig -restore /opt/oracle/12.1/grid/cdata/vmsvr-clu2/backup_20130724_235117.ocr
10. Check new OCR
ASMCMD> ls -l
Type Redund Striped Time Sys Name
OCRFILE HIGH COARSE JUL 24 10:00:00 Y REGISTRY.255.821615711
ASMCMD>
11. Check that crsd starts:
[oracle ~]$ crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.crf' on 'vmsvredu3'
CRS-2672: Attempting to start 'ora.storage' on 'vmsvredu3'
CRS-2676: Start of 'ora.storage' on 'vmsvredu3' succeeded
CRS-2676: Start of 'ora.crf' on 'vmsvredu3' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'vmsvredu3'
CRS-2676: Start of 'ora.crsd' on 'vmsvredu3' succeeded
12. Restart the cluster normally:
[root ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.crsd' on 'vmsvredu3'
CRS-2677: Stop of 'ora.crsd' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.evmd' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.storage' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'vmsvredu3'
CRS-2677: Stop of 'ora.storage' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'vmsvredu3'
CRS-2677: Stop of 'ora.drivers.acfs' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.evmd' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.asm' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'vmsvredu3'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'vmsvredu3'
CRS-2677: Stop of 'ora.cssd' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'vmsvredu3'
CRS-2677: Stop of 'ora.crf' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'vmsvredu3'
CRS-2677: Stop of 'ora.gipcd' on 'vmsvredu3' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'vmsvredu3' has completed
CRS-4133: Oracle High Availability Services has been stopped.

[root@vmsvredu3 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root ~]#
13. Check ASM mirroring again:
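SQL> -- x$kffxp maps every file extent to the disk that holds it
SQL> -- lxn_kffxp: 0 = primary copy, 1 = first mirror, 2 = second mirror
SQL> select g.name
       ,d.path
       ,e.xnum_kffxp extent
       ,decode(e.lxn_kffxp,0,'primary',1,'mirror-normal','mirror-high') mirrormeta
     from x$kffxp e
       ,v$asm_alias a
       ,v$asm_disk d
       ,v$asm_diskgroup g
     where e.number_kffxp=a.file_number
     -- join on the group number as well, so other diskgroups are not mixed in
     and e.group_kffxp=a.group_number
     and e.group_kffxp=d.group_number
     and e.disk_kffxp=d.disk_number
     and d.group_number = g.group_number
     and a.name='REGISTRY.255.821615711'
     order by 3,4 desc;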

NAME PATH EXTENT MIRRORMETA
------------------------------ ------------------------------ ---------- -------------
DG_CLUSTER ORCL:ORA_DISK_1 0 primary
DG_CLUSTER ORCL:ORA_DISK_3 0 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_2 0 mirror-high
DG_CLUSTER ORCL:ORA_DISK_2 1 primary
DG_CLUSTER ORCL:ORA_DISK_3 1 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_1 1 mirror-high
DG_CLUSTER ORCL:ORA_DISK_3 2 primary
DG_CLUSTER ORCL:ORA_DISK_1 2 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_2 2 mirror-high
DG_CLUSTER ORCL:ORA_DISK_1 3 primary
DG_CLUSTER ORCL:ORA_DISK_3 3 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_2 3 mirror-high
DG_CLUSTER ORCL:ORA_DISK_2 4 primary
DG_CLUSTER ORCL:ORA_DISK_3 4 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_1 4 mirror-high
DG_CLUSTER ORCL:ORA_DISK_3 5 primary
DG_CLUSTER ORCL:ORA_DISK_1 5 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_2 5 mirror-high
DG_CLUSTER ORCL:ORA_DISK_1 6 primary
DG_CLUSTER ORCL:ORA_DISK_3 6 mirror-normal
DG_CLUSTER ORCL:ORA_DISK_2 6 mirror-high
...
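The same kind of check can be run against this second listing. The sketch below copies the disk numbers per extent from the output above and tries every possible loss of two disks, so it does not depend on which two disks are on Site A. The function name is only illustrative.

```python
from itertools import combinations

# Disk numbers holding each extent (primary, mirror-normal, mirror-high), from the listing above.
EXTENT_COPIES_HIGH = {
    0: (1, 3, 2), 1: (2, 3, 1), 2: (3, 1, 2), 3: (1, 3, 2),
    4: (2, 3, 1), 5: (3, 1, 2), 6: (1, 3, 2),
}


def lost_extents(extent_copies, failed_disks):
    """Return the extents whose copies are all on failed disks."""
    return sorted(x for x, disks in extent_copies.items() if set(disks) <= set(failed_disks))


# Try every pair of failed disks out of ORA_DISK_1..3.
for failed in combinations((1, 2, 3), 2):
    print("failed disks", failed, "-> lost extents", lost_extents(EXTENT_COPIES_HIGH, failed))
```

Every pair of failed disks gives an empty list, because each listed extent has a copy on all three disks.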
All done. Now a disaster can come.