Mar 5, 2020

ORA-01194: file 1 needs more recovery to be consistent

Due to some reason, I lost some of archive logs while recovering a database.
I received below error and you may may have seen also the same.

The scenario is "Recovering an Oracle Database with missing Redo/ archive log".

RMAN> recover database;
Starting recover at 02-MAR-20
using channel ORA_DISK_1
starting media recovery
archived log for thread 1 with sequence 67 is already on disk as file /u03/fast_recovery_area/ORCL/archivelog/2020_02_26/o1_mf_1_67_bgzcn05f_.arc
archived log for thread 1 with sequence 69 is already on disk as file /u03/fast_recovery_area/ORCL/archivelog/2020_02_26/o1_mf_1_69_bgzdqo9n_.arc
Oracle Error:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u03/oradata/ORCL/datafile/o1_mf_system_bh914cx2_.dbf'
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 03/02/2020 08:44:21
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of archived log for thread 1 with sequence 68 and starting SCN of 624986 found to restore
RMAN>


The natural thing to check first when trying to open the database after an incomplete recovery is the fuzziness and PIT (Point In Time) of the datafiles from SQLPlus:

SQL> select fuzzy, status, checkpoint_change#,
     to_char(checkpoint_time, 'DD-MON-YYYY HH24:MI:SS') as  
     checkpoint_time, count(*)
      from v$datafile_header
     group by fuzzy, status, checkpoint_change#, checkpoint_time
     order by fuzzy, status, checkpoint_change#, checkpoint_time;
FUZZY STATUS  CHECKPOINT_CHANGE# CHECKPOINT_TIME        COUNT(*)
----- ------- ------------------ -------------------- ----------
NO    ONLINE              647929 26-FEB-2020 16:58:14          1
YES   ONLINE              551709 26-FEB-2020 15:59:43          4
SQL>

The fact that there are two rows returned and that not all files have FUZZY=NO indicates that we have a problem and that more redo is required before the database can be opened with the RESETLOGS option.

But our problem is that we don’t have that redo and we’re desperate to open our database anyway.

Recovering Without Consistency:

Again, recovering without consistency is not supported and should only be attempted as a last resort.

Opening the database with the data in an inconsistent state is actually pretty simple.  We simply need to set the “_allow_resetlogs_corruption” hidden initialization parameter and set the undo management to “manual” temporarily:

SQL> alter system set "_allow_resetlogs_corruption"=true scope=spfile;
System altered.
SQL> alter system set undo_management='MANUAL' scope=spfile;
System altered.
SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup mount;
..
Database mounted.
SQL>

Now, will the database open? The answer is still: “probably not”.  Giving it a try we get:

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2663], [0], [551715], [0], [562781], [], [], [], [], [], [], []
Process ID: 4538
Session ID: 237 Serial number: 5621
SQL>

Seems there  is another problem in database.  Actually the situation is better than previous. Here we may take it forward.

Of-course ORA-00600 error is too pathetic but here that a datafile has a recorded SCN that’s ahead of the database SCN.  The current database SCN is shown as the 3rd argument (in this case 551715) and the datafile SCN is shown as the 5th argument (in this case 562781).  Hence a difference of :

562781 - 551715 = 11066

In this example, that’s not too large of a gap.  But in a real system, the difference may be more significant.  Also if multiple datafiles are ahead of the current SCN you should expect to see multiple ORA-00600 errors.

The solution to this problem is quite simple: roll forward the current SCN until it exceeds the datafile SCN.  The database automatically generates a number of internal transactions on each startup hence the way to roll forward the database SCN is to simply perform repeated shutdowns and startups.  Depending on how big the gap is, it may be necessary to repeatedly shutdown abort and startup – the gap between the 5th and 3rd parameter to the ORA-00600 will decrease each time.  However eventually the gap will reduce to zero and the database will open:

SQL> connect / as sysdba
Connected to an idle instance.
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
..
Database mounted.
Database opened.
SQL>

No comments:

Post a Comment

Translate >>