昨天接备份同事电话反应在进行RMAN冷备的过程中报如下错,某个表空间备份失败。 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03009: failure of backup command on ch00 channel at 04/25/2015 22:02:30 ORA-19566: exceeded limit of 0 corrupt blocks for file +DATA/dbrac/datafile/tbs_11.11.435678937经过分析发现坏块导致,且坏块不属于任何对象(空块),以下是本次坏块修复步骤: 1、查找坏块 1)使用RMAN查找坏块 验证整个数据库: Rman> backup validate check logical database ; 注:当数据库版本低于11g且非归档模式,以上命令必须在数据库处于mounted状态执行 验证单个datafile Rman> backup validate check logical datafile 11 ; 而后执行以下SQL查看坏块: SQL>Select * from v$database_block_corruption ;例如: validate.sh #!/bin/bash source /home/Oracle/.bash_profile $ORACLE_HOME/bin/rman log=/home/oracle/users/validate.log <<EOF connect target / Backup validate check logical datafile 11 ; exit; EOF 2)使用DBV查找坏块: dbv userid=system/system file="+DATA/dbrac/datafile/tbs_11.11.435678937" blocksize=32768 2、确认坏块是否不属于任何对象 select segment_name, segment_type, owner from dba_extents where file_id = <Absolute file number> and <corrupted block number> between block_id and block_id + blocks -1; 例如: alter session force parallel query parallel 10; select segment_name, segment_type, owner from dba_extents where file_id = 11 and 184959440 between block_id and block_id + blocks -1;
3、确认块在 dba_free_space存在 Select * from dba_free_space where file_id= <Absolute file number> and <corrupted block number> between block_id and block_id + blocks -1; 例如: Select * from dba_free_space where file_id= 11 and 184959440 between block_id and block_id + blocks -1;
4、创建表 create table s ( n number, c varchar2(4000) ) nologging tablespace <tablespace name having the corrupt block> pctfree 99; 例如: create table users.s ( n number, c varchar2(4000) ) nologging tablespace TBS_11 pctfree 99; select segment_name,tablespace_name from dba_segments where segment_name="S" ;
Select table_name,tablespace_name from dba_tables where table_name="S" ;
5、创建触发器 CREATE OR REPLACE TRIGGER corrupt_trigger AFTER INSERT ON users.s REFERENCING OLD AS p_old NEW AS new_p FOR EACH ROW DECLARE corrupt EXCEPTION; BEGIN IF (dbms_rowid.rowid_block_number(:new_p.rowid)=&blocknumber) and (dbms_rowid.rowid_relative_fno(:new_p.rowid)=&filenumber) THEN RAISE corrupt; END IF; EXCEPTION WHEN corrupt THEN RAISE_APPLICATION_ERROR( -20000, "Corrupt block has been formatted"); END; / 6、分配空间创建在有坏块的datafile上的表 注: i)因为ASSM会自动确定下一个区段的大小,所以在ASSM的表空间上,需要创建多个表及 不断的分配空间给这些表,直到坏块被分配至其中一个对象。 ii)设置datafile的AUTOEXTEND为OFF 1)查找坏块的extent size Select BYTES from dba_free_space where file_id=<file no> and <corrupt block no> between block_id and block_id + blocks -1;例如: alter database datafile "+DATA/dbrac/datafile/tbs_11.11.435678937" autoextend off; SQL> Select BYTES from dba_free_space where file_id=11 and 184959440 between 2 block_id and block_id + blocks -1; BYTES ---------- 29360128 2)不断allocate直到坏块是S表的一部分 如果步骤1输出结果是64K,执行以下SQL: alter table users.s allocate extent (DATAFILE "+DATA/dbrac/datafile/tbs_11.11.435678937" SIZE 64K);如果大于64K使用以下 BEGIN for i in 1..1000000 loop EXECUTE IMMEDIATE "alter table users.s allocate extent (DATAFILE "||"""+DATA/dbrac/datafile/tbs_11.11.435678937"""||"SIZE 64K) "; end loop; end ; / 使用如下SQL查询坏块是否已属于某个对象: select segment_name, segment_type, owner from dba_extents where file_id = <Absolute file number> and <corrupt block number> between block_id and block_id + blocks -1 ; 例如: select segment_name, segment_type, owner from dba_extents where file_id = 11 and 184959440 between block_id and block_id + blocks -1 ;
3)插入数据初始化坏块 Begin FOR i IN 1..1000000000 loop for j IN 1..100000 loop Insert into users.s VALUES(i,"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"); end loop; commit; END LOOP; END; 7、验证是否存在坏块 Rman> Backup validate check logical datafile <fileno> ; 或者validate datafile <fileno> block <blockno reported corrupt>, <blockno reported corrupt> ; Select * from v$database_block_corruption ; 例如: validate datafile 11 block 184959440, 184961480,184961481 ; Select * from v$database_block_corruption ; 8、当第7步确认坏块已消除,即可删除测试表 DROP TABLE users.s purge; 9、切换多次日志及checkpoint Alter system switch logfile ; Alter system checkpoint ; 10、删除触发器 DROP trigger CORRUPT_TRIGGER ; DROP trigger corrupt_trigger1 ; DROP trigger corrupt_trigger2 ;更多Oracle相关信息见Oracle 专题页面 http://www.linuxidc.com/topicnews.aspx?tid=12本文永久更新链接地址