首页 / 数据库 / MySQL / ORA-00494 enqueue [CF] held for too long故障分析处理
今天接应用人员电话反应,一套备份数据库在加完表空间后出现异常,现象是数据库能查询,但是update很慢。故障处理详细: 1、查看alert日志如下: Thread 1 advanced to log sequence 16541 (LGWR switch) Current log# 1 seq# 16541 mem# 0: +DATA/racdb/onlinelog/group_1.262.792077131 Current log# 1 seq# 16541 mem# 1: +DATA/racdb/onlinelog/group_1.263.792077153 Fri Sep 12 09:52:51 2014 Archived Log entry 3480 added for thread 1 sequence 16540 ID 0x10fd7185 dest 1: Fri Sep 12 09:59:08 2014 alter tablespace ALARM_DATA_TBS add datafile "+DATA" size 10g autoextend off Fri Sep 12 10:01:11 2014 Completed: alter tablespace ALARM_DATA_TBS add datafile "+DATA" size 10g autoextend off Fri Sep 12 10:06:17 2014 alter tablespace ALARM_DATA_TBS add datafile "+DATA" size 10g autoextend off Fri Sep 12 10:13:44 2014 minact-scn: useg scan erroring out with error e:12751 Fri Sep 12 10:17:50 2014 Errors in file /Oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_m001_18487.trc: ORA-12751: cpu time or run time policy violation Fri Sep 12 10:22:46 2014 Errors in file /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_ora_17431.trc (incident=362566): ORA-00494: 持有入队 [CF] 的时间过长 (超过 900 秒) (由 "inst 1, osid 18462") Incident details in: /oracle/app/oracle/diag/rdbms/racdb/racdb/incident/incdir_362566/racdb_ora_17431_i362566.trc Fri Sep 12 10:23:16 2014 Killing enqueue blocker (pid=18462) on resource CF-00000000-00000000 by (pid=17431) by killing session 2140.29719 Attempt to get Control File Enqueue by USER pid=17431 (mode=X, type=0, timeout=900) is being blocked by inst=1, pid=18462 Please check inst 1"s alert log for more information on the blocker including a possible ORA-00494 and related incident logs Fri Sep 12 10:23:57 2014 minact-scn: useg scan erroring out with error e:12751 Fri Sep 12 10:23:57 2014 Sweep [inc][362566]: completed Sweep [inc2][362566]: completed Fri Sep 12 10:25:10 2014 Errors in file /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_lgwr_9336.trc (incident=360259): ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by "inst 1, osid 18462" Incident details in: /oracle/app/oracle/diag/rdbms/racdb/racdb/incident/incdir_360259/racdb_lgwr_9336_i360259.trc Fri Sep 12 10:25:40 2014 Killing enqueue blocker (pid=18462) on resource CF-00000000-00000000 by (pid=9336) by killing session 2140.29719 Attempt to get Control File Enqueue by LGWR pid=9336 (mode=X, type=0, timeout=900) is being blocked by inst=1, pid=18462 Please check inst 1"s alert log for more information on the blocker including a possible ORA-00494 and related incident logs Fri Sep 12 10:28:16 2014 Errors in file /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_ora_17431.trc (incident=362567): ORA-00494: 持有入队 [CF] 的时间过长 (超过 900 秒) (由 "inst 1, osid 18462") Incident details in: /oracle/app/oracle/diag/rdbms/racdb/racdb/incident/incdir_362567/racdb_ora_17431_i362567.trc Fri Sep 12 10:28:46 2014 Killing enqueue blocker (pid=18462) on resource CF-00000000-00000000 by (pid=17431) by terminating the process Attempt to get Control File Enqueue by USER pid=17431 (mode=X, type=0, timeout=300) is being blocked by inst=1, pid=18462 Please check inst 1"s alert log for more information on the blocker including a possible ORA-00494 and related incident logs Fri Sep 12 10:28:46 2014 Thread 1 advanced to log sequence 16542 (LGWR switch) Current log# 2 seq# 16542 mem# 0: +DATA/racdb/onlinelog/group_2.264.792077173 Current log# 2 seq# 16542 mem# 1: +DATA/racdb/onlinelog/group_2.265.792077193 2、进入数据库中手动切换日志(alter system switch logfile)很慢几近于不动。 3、分析日志发现LGWR进程一直在等待CF enqueue, Dump continued from file: /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_lgwr_5854.trc ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by "inst 1, osid 7955" ========= Dump for incident 380258 (ORA 494) ======== ----- Beginning of Customized Incident Dump(s) ----- ------------------------------------------------------------------------------- ENQUEUE [CF] HELD FOR TOO LONG enqueue holder: "inst 1, osid 7955"
4、进一步分析发现很多次control file sequential read等待,即处于IO等待状态, 为什么出现control file sequential read,可能控制文件不在了,或者控制文件或其快照所在目录掉了。 Current Wait Stack: Not in wait; last wait ended 17 min 35 sec ago There are 40 sessions blocked by this session. Dumping one waiter: inst: 1, sid: 1653, ser: 1 wait event: "enq: CF - contention" p1: "name|mode"=0x43460005 p2: "0"=0x0 p3: "operation"=0x0 row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0 min_blocked_time: 898 secs, waiter_cache_ver: 407 Wait State: fixed_waits=0 flags=0x21 boundary=0x0/-1 Session Wait History: elapsed time of 17 min 35 sec since last wait 0: waited for "control file sequential read" file#=0x0, block#=0x12, blocks=0x1 wait_id=10168 seq_num=10169 snap_id=1 wait times: snap=0.000352 sec, exc=0.000352 sec, total=0.000352 sec wait times: max=infinite wait counts: calls=0 os=0 occurred after 0.000018 sec of elapsed time 1: waited for "control file sequential read" file#=0x0, block#=0x10, blocks=0x1 wait_id=10167 seq_num=10168 snap_id=1 wait times: snap=0.000320 sec, exc=0.000320 sec, total=0.000320 sec wait times: max=infinite wait counts: calls=0 os=0 occurred after 0.000018 sec of elapsed time 5、进入RMAN中查看控制文件备份路径均为/oradata/racdbdb_rman_bak目录,详细如下: RMAN> show all; 使用目标数据库控制文件替代恢复目录 db_unique_name 为 racdb 的数据库的 RMAN 配置参数为: CONFIGURE RETENTION POLICY TO REDUNDANCY 1; CONFIGURE BACKUP OPTIMIZATION OFF; # default CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default CONFIGURE CONTROLFILE AUTOBACKUP OFF; CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO "/oradata/racdbdb_rman_bak/conf_%T_%F"; CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default CONFIGURE MAXSETSIZE TO UNLIMITED; # default CONFIGURE ENCRYPTION FOR DATABASE OFF; # default CONFIGURE ENCRYPTION ALGORITHM "AES128"; # default CONFIGURE COMPRESSION ALGORITHM "BASIC" AS OF RELEASE "DEFAULT" OPTIMIZE FOR LOAD TRUE ; # default CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default CONFIGURE SNAPSHOT CONTROLFILE NAME TO "/oradata/racdbdb_rman_bak/snap_racdb.ctl"; 6、查看目录发现/oradata目录掉了 bash-3.00$ df -h Filesystem size used avail capacity Mounted on /dev/dsk/c0t0d0s0 96G 47G 48G 50% / /devices 0K 0K 0K 0% /devices ctfs 0K 0K 0K 0% /system/contract proc 0K 0K 0K 0% /proc mnttab 0K 0K 0K 0% /etc/mnttab swap 87G 1.9M 87G 1% /etc/svc/volatile objfs 0K 0K 0K 0% /system/object sharefs 0K 0K 0K 0% /etc/dfs/sharetab fd 0K 0K 0K 0% /dev/fd swap 87G 120K 87G 1% /tmp swap 87G 80K 87G 1% /var/run /dev/dsk/c0t0d0s4 38G 27G 11G 72% /export/home /dev/dsk/c0t1d0s3 77G 41G 35G 54% /oracle 7、通过修改控制文件备份路径,数据库恢复正常。 RMAN> CONFIGURE SNAPSHOT CONTROLFILE NAME TO "/oracle/app/oracle/product/11.2.0/db_1/dbs/snap_racdb.ctl"; RMAN> show all; db_unique_name 为 racdb 的数据库的 RMAN 配置参数为: CONFIGURE RETENTION POLICY TO REDUNDANCY 1; CONFIGURE BACKUP OPTIMIZATION OFF; # default CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default CONFIGURE CONTROLFILE AUTOBACKUP ON; CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO "/oracle/app/oracle/product/11.2.0/db_1/dbs/conf_%T_%F"; CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default CONFIGURE MAXSETSIZE TO UNLIMITED; # default CONFIGURE ENCRYPTION FOR DATABASE OFF; # default CONFIGURE ENCRYPTION ALGORITHM "AES128"; # default CONFIGURE COMPRESSION ALGORITHM "BASIC" AS OF RELEASE "DEFAULT" OPTIMIZE FOR LOAD TRUE ; # default CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default CONFIGURE SNAPSHOT CONTROLFILE NAME TO "/oracle/app/oracle/product/11.2.0/db_1/dbs/snap_racdb.ctl";更多Oracle相关信息见Oracle 专题页面 http://www.linuxidc.com/topicnews.aspx?tid=12本文永久更新链接地址