Saturday, February 20, 2016

Detecting Soft Corruption in 12c - V$NONLOGGED_BLOCK, ORA-01578/ORA-26040

Last week we have created standby database in our dev environment and performed some ETL actions on primary side. Loading data or rebuilding indexes was performed with NOLOGGING option. After few days we noticed lots ORA-01578/ORA-26040 errors.
Corruption happened because we forgot to enable force logging.

As this was new dev database there wasn’t backup, but maybe not everything was lost. If only corrupted segments are indexes we could easily rebuild them.

Then I’ve learnt something new.
After performing validation check logical, we noticed lots corrupted blocks, but I was puzzled why do I have “v$database_block_corruption” view empty. Then my colleague told me that Oracle changed behaviour in reporting soft corrupted blocks in 12c version (we were using 12.1.0.2). New view was updated - V$NONLOGGED_BLOCK.

So I have created little demo case on how to detect (and repair) soft corrupted blocks on 12c database.



Create tablespace and small table.
SQL> create tablespace DEMO1 datafile '/oradata1/data/ora12c/demo01.dbf' size 50M;
Tablespace created.

SQL> create table objects tablespace DEMO as select * from dba_objects;
Table created.

SQL> alter table objects add constraint pk_obj primary key (object_id);
Table altered.

SQL> create index idx_obj_name on objects(object_name) tablespace demo1;
Index created.

Backup tablespace.
RMAN> backup tablespace DEMO1;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=50 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: starting piece 1 at 23-AUG-15
channel ORA_DISK_1: finished piece 1 at 23-AUG-15
piece handle=/oradata1/fra/ORA12C/backupset/2015_08_23/o1_mf_nnndf_TAG20150823T060639_bxlkpj3j_.bkp tag=TAG20150823T060639 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 23-AUG-15

Starting Control File and SPFILE Autobackup at 23-AUG-15
piece handle=/oradata1/fra/ORA12C/autobackup/2015_08_23/o1_mf_s_888473201_bxlkpktg_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 23-AUG-15

Rebuild index with NOLOGGING option to simulate soft corruption later.
RMAN> alter index idx_obj_name rebuild nologging;
Statement processed

Confirm that we have datafiles that require backup because they have been affected with NOLOGGING operation.
RMAN> report unrecoverable;

Report of files that need backup due to unrecoverable operations
File Type of Backup Required Name
---- ----------------------- -----------------------------------
2    full or incremental     /oradata1/data/ora12c/demo01.dbf
5    full or incremental     /oradata1/data/ora12c/example01.dbf

Simulate corruption.
RMAN> alter database datafile 2 offline;
Statement processed

RMAN> restore datafile 2;

Starting restore at 23-AUG-15
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00002 to /oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: reading from backup piece /oradata1/fra/ORA12C/backupset/2015_08_23/o1_mf_nnndf_TAG20150823T060639_bxlkpj3j_.bkp
channel ORA_DISK_1: piece handle=/oradata1/fra/ORA12C/backupset/2015_08_23/o1_mf_nnndf_TAG20150823T060639_bxlkpj3j_.bkp tag=TAG20150823T060639
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished restore at 23-AUG-15

RMAN> recover datafile 2;

Starting recover at 23-AUG-15
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 23-AUG-15

RMAN> alter database datafile 2 online;
Statement processed

Query table with corrupted index and notice error.
SQL> select count(*) from objects where object_name like 'A%';
select count(*) from objects where object_name like 'A%'
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 2, block # 2617)
ORA-01110: data file 2: '/oradata1/data/ora12c/demo01.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option

Let’s perform validation of datafile to check block corruption.
RMAN> backup validate check logical datafile 2;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=40 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     460            129          6401            1776280
  File Name: /oradata1/data/ora12c/demo01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              1537
  Index      0              462
  Other      0              4272

Finished backup at 23-AUG-15

Notice that we have 460 blocks marked corrupt but v$database_block_corruption view is empty.
SQL> select count(*) from v$database_block_corruption;

  COUNT(*)
----------
  0

Let’s query v$nonlogged_block view.
SQL> set lines 200
SQL> set pages 999
SQL> select file#, block#, blocks,object#,reason from v$nonlogged_block;

     FILE#     BLOCK#   BLOCKS OBJECT#      REASON
---------- ---------- ---------- ---------------------------------------- -------
  2  2308       12       UNKNOWN
  2  2321       15       UNKNOWN
  2  2337       15       UNKNOWN
  2  2353       15       UNKNOWN
  2  2369       15       UNKNOWN
  2  2385       15       UNKNOWN
  2  2401       15       UNKNOWN
  2  2417       15       UNKNOWN
  2  2434      126       UNKNOWN
  2  2562      126       UNKNOWN
  2  2690       91       UNKNOWN

11 rows selected.


Will RMAN detect that we have corrupted blocks?
RMAN> backup datafile 2;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=54 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: starting piece 1 at 23-AUG-15
channel ORA_DISK_1: finished piece 1 at 23-AUG-15
piece handle=/oradata1/fra/ORA12C/backupset/2015_08_23/o1_mf_nnndf_TAG20150823T061602_bxll8275_.bkp tag=TAG20150823T061602 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 23-AUG-15
RMAN backup won’t fail due to NOLOGGING corrupt blocks and our backup will contain soft corrupted blocks.

Let’s Identify corrupt segments using v$nonlogged_block view.
set lines 2000
set pages 9999
col owner for a20
col partition_name for a10
col segment_name for a20

SELECT e.owner, e.segment_type, e.segment_name, e.partition_name, c.file#
     , greatest(e.block_id, c.block#) corr_start_block#
     , least(e.block_id+e.blocks-1, c.block#+c.blocks-1) corr_end_block#
     , least(e.block_id+e.blocks-1, c.block#+c.blocks-1)
       - greatest(e.block_id, c.block#) + 1 blocks_corrupted
  FROM dba_extents e, V$NONLOGGED_BLOCK c
 WHERE e.file_id = c.file#
   AND e.block_id <= c.block# + c.blocks - 1
   AND e.block_id + e.blocks - 1 >= c.block#
UNION
SELECT s.owner, s.segment_type, s.segment_name, s.partition_name, c.file#
     , header_block corr_start_block#
     , header_block corr_end_block#
     , 1 blocks_corrupted
  FROM dba_segments s, V$NONLOGGED_BLOCK c
 WHERE s.header_file = c.file#
   AND s.header_block between c.block# and c.block# + c.blocks - 1
UNION
SELECT null owner, null segment_type, null segment_name, null partition_name, c.file#
     , greatest(f.block_id, c.block#) corr_start_block#
     , least(f.block_id+f.blocks-1, c.block#+c.blocks-1) corr_end_block#
     , least(f.block_id+f.blocks-1, c.block#+c.blocks-1)
       - greatest(f.block_id, c.block#) + 1 blocks_corrupted
  FROM dba_free_space f, V$NONLOGGED_BLOCK  c
 WHERE f.file_id = c.file#
   AND f.block_id <= c.block# + c.blocks - 1
   AND f.block_id + f.blocks - 1 >= c.block#
order by file#, corr_start_block#
/



OWNER       SEGMENT_TYPE SEGMENT_NAME      PARTITION_      FILE# CORR_START_BLOCK# CORR_END_BLOCK# BLOCKS_CORRUPTED
-------------------- ------------------ -------------------- ---------- ---------- ----------------- --------------- ----------------
SYS       INDEX  IDX_OBJ_NAME     2  2308  2311      4
SYS       INDEX  IDX_OBJ_NAME     2  2312  2319      8
SYS       INDEX  IDX_OBJ_NAME     2  2321  2327      7
SYS       INDEX  IDX_OBJ_NAME     2  2328  2335      8
SYS       INDEX  IDX_OBJ_NAME     2  2337  2343      7
SYS       INDEX  IDX_OBJ_NAME     2  2344  2351      8
SYS       INDEX  IDX_OBJ_NAME     2  2353  2359      7
SYS       INDEX  IDX_OBJ_NAME     2  2360  2367      8
SYS       INDEX  IDX_OBJ_NAME     2  2369  2375      7
SYS       INDEX  IDX_OBJ_NAME     2  2376  2383      8
SYS       INDEX  IDX_OBJ_NAME     2  2385  2391      7
SYS       INDEX  IDX_OBJ_NAME     2  2392  2399      8
SYS       INDEX  IDX_OBJ_NAME     2  2401  2407      7
SYS       INDEX  IDX_OBJ_NAME     2  2408  2415      8
SYS       INDEX  IDX_OBJ_NAME     2  2417  2423      7
SYS       INDEX  IDX_OBJ_NAME     2  2424  2431      8
SYS       INDEX  IDX_OBJ_NAME     2  2434  2559    126
SYS       INDEX  IDX_OBJ_NAME     2  2562  2687    126
SYS       INDEX  IDX_OBJ_NAME     2  2690  2780     91

19 rows selected.

This is the best outcome to get if you notice corruption errors. All errors are related to index corruption so we could fix this problem rebuilding index.

SQL> alter index idx_obj_name rebuild;
alter index idx_obj_name rebuild
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 2, block # 2308)
ORA-01110: data file 2: '/oradata1/data/ora12c/demo01.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option

Simply issuing "alter index rebuild" command won't work.
We should mark index unusable to drop segment before rebuilding it or just rebuild index with online option.

It is better choice to mark index unusable because you don't need additional space then, but I will simply rebuild index with online option and see what will happen.
SQL> alter index idx_obj_name rebuild online;
Index altered.

SQL> select count(*) from objects where object_name like 'A%';

  COUNT(*)
----------
      2079

No errors... but, let's validate datafile for corruption.
RMAN> backup validate check logical datafile 2;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=40 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     460            94           6402            1779294
  File Name: /oradata1/data/ora12c/demo01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              1537
  Index      0              587
  Other      0              4182

Finished backup at 23-AUG-15
Notice "Marked Corrupt" column. Hm... 460 like before.

Don't worry, this is not new corruption. These are FREE blocks which will be reused and Oracle will automatically re-format those blocks.
set lines 2000
set pages 9999
col owner for a20
col partition_name for a10
col segment_name for a20

SELECT e.owner, e.segment_type, e.segment_name, e.partition_name, c.file#
     , greatest(e.block_id, c.block#) corr_start_block#
     , least(e.block_id+e.blocks-1, c.block#+c.blocks-1) corr_end_block#
     , least(e.block_id+e.blocks-1, c.block#+c.blocks-1)
       - greatest(e.block_id, c.block#) + 1 blocks_corrupted
  FROM dba_extents e, V$NONLOGGED_BLOCK c
 WHERE e.file_id = c.file#
   AND e.block_id <= c.block# + c.blocks - 1
   AND e.block_id + e.blocks - 1 >= c.block#
UNION
SELECT s.owner, s.segment_type, s.segment_name, s.partition_name, c.file#
     , header_block corr_start_block#
     , header_block corr_end_block#
     , 1 blocks_corrupted
  FROM dba_segments s, V$NONLOGGED_BLOCK c
 WHERE s.header_file = c.file#
   AND s.header_block between c.block# and c.block# + c.blocks - 1
UNION
SELECT null owner, null segment_type, null segment_name, null partition_name, c.file#
     , greatest(f.block_id, c.block#) corr_start_block#
     , least(f.block_id+f.blocks-1, c.block#+c.blocks-1) corr_end_block#
     , least(f.block_id+f.blocks-1, c.block#+c.blocks-1)
       - greatest(f.block_id, c.block#) + 1 blocks_corrupted
  FROM dba_free_space f, V$NONLOGGED_BLOCK  c
 WHERE f.file_id = c.file#
   AND f.block_id <= c.block# + c.blocks - 1
   AND f.block_id + f.blocks - 1 >= c.block#
order by file#, corr_start_block#
/


OWNER       SEGMENT_TYPE SEGMENT_NAME      PARTITION_      FILE# CORR_START_BLOCK# CORR_END_BLOCK# BLOCKS_CORRUPTED
-------------------- ------------------ -------------------- ---------- ---------- ----------------- --------------- ----------------
           2  2308  2319     12
           2  2321  2335     15
           2  2337  2351     15
           2  2353  2367     15
           2  2369  2383     15
           2  2385  2399     15
           2  2401  2415     15
           2  2417  2431     15
           2  2434  2559    126
           2  2562  2687    126
           2  2690  2780     91

11 rows selected.

We could force re-formatting creating dummy table and inserting data to dummy table.
Check Doc ID 336133.1.
create table s (
       n number,
       c varchar2(4000)
     ) nologging tablespace DEMO1;


SQL> BEGIN
FOR i IN 1..1000000 LOOP
INSERT /*+ APPEND */ INTO sys.s select i, lpad('REFORMAT',3092, 'R') from dual;
commit ;
END LOOP;
END;
/  2    3    4    5    6    7


BEGIN
*
ERROR at line 1:
ORA-01653: unable to extend table SYS.S by 128 in tablespace DEMO1
ORA-06512: at line 3


SQL> drop table sys.s purge;
Table dropped.

Notice that we don't have corrupted blocks any more.
RMAN> backup validate check logical datafile 2;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=67 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     0              3929         14593           1818933
  File Name: /oradata1/data/ora12c/demo01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              9851
  Index      0              461
  Other      0              351

Finished backup at 23-AUG-15



Recovering corrupted index is easy, but recovering data blocks could be slightly difficult or sometimes impossible.
Perform validation and backups regularly because corruption will hit you when you least expect ;)



2 comments:

  1. Very useful and perfectly described, thanks a lot

    ReplyDelete
  2. Very good article, thank you.

    ReplyDelete