Sunday, November 26, 2023

Restoring the control file from NFS fails

Oracle version: 19.15

OS version: AIX 7.2


Problem description:

Running restore controlfile from an NFS mount fails with RMAN-06172:

RMAN> restore controlfile from '/backup/rman/ORCL_0h1d5s04_1_1.ctl';


Starting restore at 19-NOV-2022 00:32:30

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=600 device type=DISK


RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of restore command at 11/19/2022 00:32:31

RMAN-06172: no AUTOBACKUP found or specified handle is not a valid copy or piece


Analysis:

Copying the backup piece ORCL_0h1d5s04_1_1.ctl to a local directory and restoring the control file from there succeeds:

RMAN> restore controlfile from '/u01/app/backup/ORCL_0h1d5s04_1_1.ctl';


Starting restore at 19-NOV-2022 00:50:21

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=120 device type=DISK


channel ORA_DISK_1: restoring control file

channel ORA_DISK_1: restore complete, elapsed time: 00:00:02

output filename=/u01/oradata/orcl/control01.ctl

output filename=/u01/oradata/orcl/control02.ctl

Finished restore at 19-NOV-2022


This confirms the problem lies in the NFS mount options; checking shows the share was mounted without any of the required options:

# mount |grep backup

172.26.18.110 /backup/rman   /backup  nfs3


Solution:

Remount the NFS share with the correct mount options:

# mount -o cio,rw,bg,hard,nointr,rsize=32768,wsize=32768,proto=tcp,noac,vers=3,timeo=600 172.26.18.110:/backup /backup/rman


On Linux, use the following options instead:

rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,vers=3,timeo=600
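On Linux the mount can also be made persistent with an /etc/fstab entry carrying the same options. This is a sketch only; the server address and paths below are taken from this example and should be adjusted to your environment:

```
# hypothetical /etc/fstab entry for the RMAN backup share
172.26.18.110:/backup  /backup/rman  nfs  rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,vers=3,timeo=600  0 0
```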



RMAN causing "enq: CF - contention" waits

Oracle version: 19.14

OS version: Linux 7.9


Problem description:

The database hung during the RMAN backup window: for roughly two minutes no operations or new connections were possible. At that moment RMAN happened to be running delete obsolete.


Analysis:

ASH analysis of the incident window shows "enq: CF - contention" and "control file parallel write" wait events.


This matches Bug 33727983: when there are many obsolete objects to delete, scanning the control file takes a long time, and with multiple channels working concurrently this produces control file contention. The RMAN log shows more than ten thousand obsolete objects being deleted.


Solution:

Bug 33727983 is fixed in 19.17 and later: apply an RU of 19.17 or later, or apply the one-off patch for 33727983. As a temporary workaround, run delete obsolete with a single channel only, which avoids the control file contention caused by multiple channels performing the same operation.
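The single-channel workaround can be sketched as an RMAN run block (illustrative only; the channel name c1 is arbitrary, and retention policy settings depend on your configuration):

```
# gauge the number of obsolete backups first
report obsolete;

# then delete with a single disk channel to avoid control file contention
run {
  allocate channel c1 device type disk;
  delete noprompt obsolete;
}
```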



Monday, October 23, 2023

datapatch fails with ORA-29913

Oracle version: 12.2.0.1.181016

OS version: Solaris 10


Problem description:

While patching the database from 12.2.0.1.181016 to 12.2.0.1.200714, running datapatch in the final step failed with an ORA-29913 error:

-bash-3.2$ ./datapatch -verbose

SQL Patching tool version 12.2.0.1.0 Production on Thu Sep 17 18:59:01 2020

Copyright (c) 2012, 2020, Oracle.  All rights reserved.


Log file for this invocation: /oracle/app/cfgtoollogs/sqlpatch/sqlpatch_23713_2020_09_17_18_59_01

/sqlpatch_invocation.log


Connecting to database...OK

Bootstrapping registry and package to current versions...done


DBD::Oracle::db selectrow_array failed: ORA-29913: error in executing 

ODCIEXTTABLEFETCH callout

ORA-00607: Internal error occurred while making a change to a data block

ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [50181], [63053], [], [], [], [], [], [], [], [] (DBD ERROR: error possibly near <*> indicator at char 7 in 'SELECT <*> dbms_sqlpatch.verify_queryable_inventory FROM dual') [for Statement "SELECT dbms_sqlpatch.verify_queryable_inventory FROM dual"] at /oracle/app/product/12.2.0/sqlpatch/sqlpatch.pm line 6027, <LOGFILE> line 107.



Analysis:

The error message shows that datapatch failed while executing dbms_sqlpatch.verify_queryable_inventory; testing it directly in SQL*Plus reproduces the same ORA-29913 error:

SQL> SELECT dbms_sqlpatch.verify_queryable_inventory

FROM dual  2  ;

SELECT dbms_sqlpatch.verify_queryable_inventory

       *

ERROR at line 1:

ORA-29913: error in executing ODCIEXTTABLEFETCH callout

ORA-00607: Internal error occurred while making a change to a data block

ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [50181],

[63053], [], [], [], [], [], [], [], []


The trace file reveals a corrupt block in a LOB segment, which caused the error:

Corrupt Block Found

TIME STAMP (GMT) = 09/17/2020 19:02:17

CONT = 0, TSN = 1, TSNAME = SYSAUX

RFN = 3, BLK = 50181, RDBA = 12633093

OBJN = 10486, OBJD = 10486, OBJECT = SYS_LOB0000010485C00001$$, SUBOBJECT =

SEGMENT OWNER = SYS, SEGMENT TYPE = Lob Segment

Errors in file /oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_24528.trc  (incident=1080037):

ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [50181], [63053], [], [], [], [], [],[], [], []

Incident details in: /oracle/app/diag/rdbms/orcl/orcl/incident/incdir_1080037

/orcl_ora_24528_i1080037.trc
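In general, the segment that owns a corrupt block can be cross-checked from the file and block numbers reported in the trace (RFN = 3, BLK = 50181, TSNAME = SYSAUX above) with a query along these lines:

```
SQL> select owner, segment_name, segment_type
       from dba_extents
      where tablespace_name = 'SYSAUX'
        and relative_fno = 3
        and 50181 between block_id and block_id + blocks - 1;
```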


Querying dba_lobs shows this LOB segment belongs to the OPATCH_XINV_TAB table:

SQL> select table_name,column_name,TABLESPACE_NAME from dba_lobs where owner='SYS' and segment_name='SYS_LOB0000010485C00001$$';


TABLE_NAME       COLUMN_NAME    TABLESPACE_NAME
---------------- -------------- ---------------
OPATCH_XINV_TAB  XML_INVENTORY  SYSAUX


So the datapatch failure was caused by a corrupt block in OPATCH_XINV_TAB.


Solution:

Move the LOB segment once to relocate it away from the corrupt block:

SQL> alter table sys.opatch_xinv_tab move lob(xml_inventory) store as (tablespace sysaux);


Table altered.


After the move, confirm the related objects are VALID:

SQL> select owner,object_name,object_type,status from dba_objects where object_name in ('DBMS_QOPATCH','OPATCH_XML_INV');


OWNER  OBJECT_NAME     OBJECT_TYPE   STATUS
-----  --------------- ------------- ------
SYS    DBMS_QOPATCH    PACKAGE       VALID
SYS    DBMS_QOPATCH    PACKAGE BODY  VALID
SYS    OPATCH_XML_INV  TABLE         VALID
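Had the move left any of these objects INVALID, recompiling them would be an option, for example with the standard recompilation script:

```
SQL> @?/rdbms/admin/utlrp.sql
```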


Re-testing dbms_sqlpatch.verify_queryable_inventory now completes without error:

SQL> select dbms_sqlpatch.verify_queryable_inventory from dual;


VERIFY_QUERYABLE_INVENTORY

--------------------------------------------------

OK


Running datapatch again then completes successfully.