记一次Oracle坏块修复过程

昨天接备份同事电话反应在进行RMAN冷备的过程中报如下错，某个表空间备份失败。
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch00 channel at 04/25/2015 22:02:30
ORA-19566: exceeded limit of 0 corrupt blocks for file +DATA/dbrac/datafile/tbs_11.11.435678937经过分析发现坏块导致，且坏块不属于任何对象（空块），以下是本次坏块修复步骤：
1、查找坏块
1）使用RMAN查找坏块
验证整个数据库：
Rman> backup validate check logical database ;
注：当数据库版本低于11g且非归档模式，以上命令必须在数据库处于mounted状态执行
验证单个datafile
Rman> backup validate check logical datafile 11 ;
而后执行以下SQL查看坏块：
SQL>Select * from v$database_block_corruption ;例如：
validate.sh
#！/bin/bash
source /home/Oracle/.bash_profile
$ORACLE_HOME/bin/rman log=/home/oracle/users/validate.log <<EOF
connect target /
Backup validate check logical datafile 11 ;
exit;
EOF 2）使用DBV查找坏块：
dbv userid=system/system file="+DATA/dbrac/datafile/tbs_11.11.435678937" blocksize=32768 2、确认坏块是否不属于任何对象
select segment_name, segment_type, owner
from dba_extents
where file_id = <Absolute file number>
and <corrupted block number> between block_id
and block_id + blocks -1;
例如：
alter session force parallel query parallel 10;
select segment_name, segment_type, owner
from dba_extents
where file_id = 11
and 184959440 between block_id
and block_id + blocks -1;

3、确认块在 dba_free_space存在
Select * from dba_free_space where file_id= <Absolute file number>
and <corrupted block number> between block_id and block_id + blocks -1;
例如：
Select * from dba_free_space where file_id= 11 and 184959440 between block_id and block_id + blocks -1;

4、创建表
create table s （
n number,
c varchar2（4000）
） nologging tablespace <tablespace name having the corrupt block> pctfree 99;
例如：
create table users.s （
n number,
c varchar2（4000）
） nologging tablespace TBS_11 pctfree 99;
select segment_name,tablespace_name from dba_segments
where segment_name="S" ;

Select table_name,tablespace_name from dba_tables where table_name="S" ;

5、创建触发器
CREATE OR REPLACE TRIGGER corrupt_trigger
AFTER INSERT ON users.s
REFERENCING OLD AS p_old NEW AS new_p
FOR EACH ROW
DECLARE
corrupt EXCEPTION;
BEGIN
IF （dbms_rowid.rowid_block_number（:new_p.rowid）=&blocknumber）
and （dbms_rowid.rowid_relative_fno（:new_p.rowid）=&filenumber） THEN
RAISE corrupt;
END IF;
EXCEPTION
WHEN corrupt THEN
RAISE_APPLICATION_ERROR（ -20000, "Corrupt block has been formatted"）;
END;
/ 6、分配空间创建在有坏块的datafile上的表
注：
i）因为ASSM会自动确定下一个区段的大小，所以在ASSM的表空间上，需要创建多个表及
不断的分配空间给这些表，直到坏块被分配至其中一个对象。
ii）设置datafile的AUTOEXTEND为OFF 1）查找坏块的extent size
Select BYTES from dba_free_space where file_id=<file no> and <corrupt block no> between
block_id and block_id + blocks -1;例如：
alter database datafile "+DATA/dbrac/datafile/tbs_11.11.435678937" autoextend off; SQL> Select BYTES from dba_free_space where file_id=11 and 184959440 between
2 block_id and block_id + blocks -1;
BYTES
----------
29360128 2）不断allocate直到坏块是S表的一部分
如果步骤1输出结果是64K，执行以下SQL：
alter table users.s
allocate extent （DATAFILE "+DATA/dbrac/datafile/tbs_11.11.435678937" SIZE 64K）;如果大于64K使用以下
BEGIN
for i in 1..1000000 loop
EXECUTE IMMEDIATE "alter table users.s allocate extent （DATAFILE "||"""+DATA/dbrac/datafile/tbs_11.11.435678937"""||"SIZE 64K） ";
end loop;
end ;
/
使用如下SQL查询坏块是否已属于某个对象：
select segment_name, segment_type, owner
from dba_extents
where file_id = <Absolute file number>
and <corrupt block number> between block_id
and block_id + blocks -1 ;
例如：
select segment_name, segment_type, owner
from dba_extents
where file_id = 11
and 184959440 between block_id
and block_id + blocks -1 ;

3）插入数据初始化坏块
Begin
FOR i IN 1..1000000000 loop
for j IN 1..100000 loop
Insert into users.s VALUES（i,"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"）;
end loop;
commit;
END LOOP;
END; 7、验证是否存在坏块
Rman> Backup validate check logical datafile <fileno> ;
或者validate datafile <fileno> block <blockno reported corrupt>, <blockno reported corrupt> ;
Select * from v$database_block_corruption ;
例如：
validate datafile 11 block 184959440, 184961480,184961481 ;
Select * from v$database_block_corruption ; 8、当第7步确认坏块已消除，即可删除测试表
DROP TABLE users.s purge; 9、切换多次日志及checkpoint
Alter system switch logfile ;
Alter system checkpoint ; 10、删除触发器
DROP trigger CORRUPT_TRIGGER ;
DROP trigger corrupt_trigger1 ;
DROP trigger corrupt_trigger2 ;更多Oracle相关信息见Oracle 专题页面 http://www.linuxidc.com/topicnews.aspx？tid=12本文永久更新链接地址