Friday, August 16, 2013

Loss of Undo tablespace datafile and Process to restore and recover

This is a recovery scenario in which a datafile in a rollback segment
tablespace has been lost or damaged to a point that Oracle cannot recognize it
anymore. Trying to startup the database will result in ORA-1157, ORA-1110, and
possibly an operating system level error such as ORA-7360. Trying to shut down
the database in normal or immediate mode will result in ORA-1116, ORA-1110,
and possibly an operating system level error such as ORA-7368.
Solution
=========
The approach to be followed depends on the specific scenario in which
the loss of the rollback datafile is detected:
I. THE DATABASE IS DOWN
———————–
Attempting to startup the database will result in ORA-1157 and ORA-1110.
The solution here depends on whether the database was cleanly shut down
or not.
I.A. THE DATABASE WAS CLEANLY SHUT DOWN
—————————————
If you are ABSOLUTELY POSITIVE that the database was cleanly shutdown,
i.e., it was closed with either shutdown NORMAL or IMMEDIATE, then
the simplest solution is to offline drop the missing datafile, open the
database in restricted mode, and then drop and recreate the undo
tablespace to which the file belonged.  DO NOT follow this procedure
if the database was shut down ABORT or if it crashed.
The steps are:
1. Make sure the database was last cleanly shut down.
Check the alert.log file for this instance.  Go to the bottom of
the file and make sure the last time you shut the database down
you got the messages:
“Shutting down instance (immediate)”
OR
“alter database close normal
Completed: alter database close normal”
This also includes the case of a clean shutdown followed by a
failed attempt to startup the database.  In that case, Oracle will
issue error messages and shut itself down abort.  For the purposes
of this solution, though, this counts as a clean shutdown.
If that is not the case, i.e., if the last time YOU shut the database
down it was in abort mode, or the database crashed itself, it is
NOT safe to proceed.  You should follow the instructions for
case I.B below.
2. If using automatic UNDO_MANAGEMENT, comment out this entry from the parameter
file, or set it to MANUAL.
If using rollback segments, remove all the rollback segments in the
tablespace to which the lost datafile belongs from the ROLLBACK_SEGMENTS
parameter in the init.ora file for this instance.  If you are not sure about which rollbacks are
in that tablespace, simply comment out the whole ROLLBACK_SEGMENTS entry.
3. Mount the database in restricted mode.
SQL> STARTUP RESTRICT MOUNT
4. Offline drop the lost datafile.
SQL> ALTER DATABASE DATAFILE ‘<full_path_file_name>’ OFFLINE DROP;
5. Open the database.
SQL> ALTER DATABASE OPEN
You should receive the message “Statement processed,”.
If instead you get ORA-604, ORA-376, and ORA-1110, it is likely the shutdown
was not normal/immediate.  Review the rest of the options available and/or
contact Oracle Support Services.
6. Drop the undo tablespace or tablespace which contains rollback segments
to which the datafile belonged.
SQL> DROP TABLESPACE <tablespace_name> INCLUDING CONTENTS;
7. Recreate the undo tablespace.  If using rollback segments, recreate the
rollback segment tablespace and all it’s rollback segments.  Remember to
bring the rollbacks online after you create them.
8. Edit the parameter file setting:
UNDO_MANAGEMENT=AUTO
UNDO_TABLESPACE=<new undo tablespace name>
If using rollback segments, reinclude the rollbacks you just recreated in
the ROLLBACK_SEGMENTS parameter in the init.ora file for this instance.
As rollback segments were brought online in step #7, no need to proceed
with shutdown/startup as needed for undo tablespace.  All that is required
is:
SQL> ALTER SYSTEM DISABLE RESTRICTED SESSION;
9. If using UNDO tablespace, Shutdown and startup the the database.
I.B. THE DATABASE WAS NOT CLEANLY SHUT DOWN
——————————————-
This is the situation where the database was last shut down abort or crashed.
In this case, it is almost certain that the rollback segments that had
extents in the lost datafile still contain active transactions.  Therefore,
the file cannot be offlined or dropped.  You must restore the lost datafile
from a backup and apply media recovery to it.  If the database is in
NOARCHIVELOG mode, you will only succeed in recovering the datafile if the
redo to be applied is within the range of your online logs.  Otherwise, the
last full backup must be restored.  If a backup of the datafile is not
available, please contact Oracle Support Services.
These are the steps:
1. Restore the lost file from a backup.
2. Mount the database:
SQL> STARTUP MOUNT;
3. Issue the following query:
SQL> SELECT FILE#, NAME, STATUS FROM V$DATAFILE;
If the status of the file you just restored is “OFFLINE,” you must
online it before proceeding:
SQL> ALTER DATABASE DATAFILE ‘<full_path_file_name>’ ONLINE;
4. Issue the following query:
SQL> SELECT V1.GROUP#, MEMBER, SEQUENCE#, FIRST_CHANGE#
FROM V$LOG V1, V$LOGFILE V2
WHERE V1.GROUP# = V2.GROUP# ;
This will list all your online redolog files and their respective
sequence and first change numbers.
5. If the database is in NOARCHIVELOG mode, issue the query:
SQL> SELECT FILE#, CHANGE# FROM V$RECOVER_FILE;
If the CHANGE# is GREATER than the minimum FIRST_CHANGE# of your
logs, the datafile can be recovered.  Just keep in mind that all the
logs to be applied will be online logs, and move on to step 6.
If the CHANGE# is LESSER than the minimum FIRST_CHANGE# of your logs,
the file cannot be recovered.  Your options at this point include
restoring a full backup if one is available or forcing the database
to open in an inconsistent state to get a full export out of it.
For further details and to assist you in your decision, please contact
Oracle Support Services.
6. Recover the datafile:
SQL> RECOVER DATAFILE ‘<full_path_file_name>’
7. Confirm each of the  logs that you are prompted for until you receive
the message “Media recovery complete”.  If you are prompted for a
non-existing archived log, Oracle probably needs one or more of the
online logs to proceed with the recovery.  Compare the sequence number
referenced in the ORA-280 message with the sequence numbers of your online
logs.  Then enter the full path name of one of the members of the redo group
whose sequence number matches the one you are being asked for.  Keep entering
online logs as requested until you receive the message “Media recovery
complete”.
8. Open the database:
SQL> ALTER DATABASE OPEN;
II. THE DATABASE IS UP
———————-
If you have detected the loss of the rollback datafile and the database
is still up and running, DO NOT SHUT IT DOWN.  In most cases, it is
simpler to solve this problem with the database up than with it down.
Two approaches are possible in this scenario:
A) The first approach involves creating a new UNDO (or rollback segment)
tablespace, altering the system to use the new and dropping the old.
B) The other approach involves offlining the lost datafile, restoring it from
backup, and then applying media recovery to it to make it consistent
with the rest of the database.  This method can only be used if
the database is in ARCHIVELOG mode.
In general, approach IIA is simpler to try first, however, depending on
activity and undo usage, the drop of the older tablespace may result
in error regarding active transactions.  There are a few things to try to
get the older tablespace to drop, but if all fail, approach IIB may be required.
APPROACH II.A: RECREATING THE ROLLBACK TABLESPACE
————————————————-
This approach can be used regardless of the archival mode of the database.
The steps are:
For undo tablespaces:
…………………
1. Create a new undo tablespace:
SQL> CREATE UNDO TABLESPACE <new undo tablepace name>
DATAFILE ‘<new location and name of datafile>’ SIZE ….;
2. Alter the system to use the new undo tablespace:
SQL> ALTER SYSTEM SET UNDO_TABLESPACE=’<new undo tablepace name>’;
At this point, all new undo will be allocated in undo segments managed within
the UNDO_TBS2 tablespace.
3. Try dropping older undo tablespace:
SQL> DROP TABLESPACE <old undo tablespace name> INCLUDING CONTENTS;
If this succeeds, you are done.  If it returns an error, please see
“Handling errors dropping undo/rollback segments”.
For rollback segment tablespaces:
……………………………
1. Create new tablespace to house rollback segments.  For spacing and extent
sizing, see information in dba_tablespaces for old rollback segment tablespace.
SQL> CREATE TABLESPACE …..
2. Create rollback segments in this tablespace
SQL> CREATE ROLLBACK SEGMENT…..
3. Online the rollback segments in this new tablespace:
SQL> ALTER ROLLBACK SEGMENT <new_rollback_segment> ONLINE;
Repeat for all rollback segments.
4. Try to offline all the rollback segments in the tablespace to which
the lost datafile belongs.
SQL> ALTER ROLLBACK SEGMENT <rollback_segment> OFFLINE;
Repeat this statement for all rollbacks in the tablespace.
5. Check the status of the rollbacks.
They must all be offline before they can be dropped. Issue the query:
SQL> SELECT SEGMENT_NAME, STATUS FROM DBA_ROLLBACK_SEGS
WHERE TABLESPACE_NAME = ‘<TABLESPACE_NAME>’;
6. Drop all offlined rollback segments.
For each rollback returned by the query in step 2 with status
“OFFLINE,” issue the statement:
SQL> DROP ROLLBACK SEGMENT <rollback_segment>;
7. Drop the rollback segment tablespace:
SQL> DROP TABLESPACE <tablespace_name> INCLUDING CONTENTS;
If this succeeds, you are done.  If it returns an error, please see
“Handling errors dropping undo/rollback segments”.
HANDLING ERRORS DROPPING UNDO/ROLLBACK SEGMENTS:
———————————————–
NOTE, although automatic undo and undo tablespace is used by the database,
the following procedure is the same.  The only difference is that the segment
names are assigned by Oracle.
1. Check for any rollback segments online.
SQL> SELECT SEGMENT_NAME, STATUS FROM DBA_ROLLBACK_SEGS
WHERE TABLESPACE_NAME = ‘<OLD TABLESPACE_NAME>’;
If any of the rollbacks you tried to offline still has an “ONLINE” status,
this is usually an indication that this segment contains active transactions.
2. Check for active transactions with the following query:
SQL> SELECT SEGMENT_NAME, XACTS ACTIVE_TX, V.STATUS
FROM V$ROLLSTAT V, DBA_ROLLBACK_SEGS
WHERE TABLESPACE_NAME = ‘<TABLESPACE_NAME>’ AND SEGMENT_ID = USN;
If the above query returns no rows, it means all the rollbacks in
the affected tablespace are already offline.  Repeat the query in
step 1 to retrieve the names of the rollbacks that just became
offline and attempt to drop the undo tablespace or individual
rollback segments as described above.
If the above query returns one or more rows, they should show
status “PENDING OFFLINE”.  Next, check the ACTIVE_TX
column for each rollback.  If it has a value of 0, it implies
there are no pending transactions left in the rollback, and it
should go offline shortly.  Repeat the query in step 1 a few
more times until it shows the rollback segments being offline and
then attempt the drop again.  Once the drop is successful, you are done.
If any of the “pending offline” rollbacks has a value of 1 or
greater in the ACTIVE_TX column, move on to step 3.
3. Force rollbacks with active transactions to go offline.
At this point, the only way to move forward is to have the
“pending offline” rollbacks released.  The active transactions
in these rollbacks must either be committed or rolled back.
The following query shows which users have transactions assigned
to which rollbacks:
SQL> SELECT S.SID, S.SERIAL#, S.USERNAME, R.NAME “ROLLBACK”
FROM V$SESSION S, V$TRANSACTION T, V$ROLLNAME R
WHERE R.NAME IN (‘<PENDING_ROLLBACK_1>’, … , ‘<PENDING_ROLLBACK_N>’)
AND S.TADDR = T.ADDR AND T.XIDUSN = R.USN;
You may directly contact the users with transactions in the
“pending offline” rollbacks and ask them to commit (preferably)
or rollback immediately.  If that is not feasible, you can force
that to happen by killing their sessions.  For each of the entries
returned by the above query, issue the statement:
SQL> ALTER SYSTEM KILL SESSION ‘<SID>, <SERIAL#>’;
where <SID> and <SERIAL#> are those returned by the previous
query.  After the sessions are killed, it may take a few minutes
before Oracle finishes rolling back and doing cleanup work.  Go back
to step 1 and repeat the query in there periodically until all
rollbacks in the affected tablespace are offline and ready to be
dropped.
If you are unable to drop the older undo tablespace or rollback segments,
try to restore the file from backup and recover it fully (approach II.B).
Once the file is recovered try to drop the older undo tablespace.  If this
is not possible, contact Oracle Support Services.
APPROACH II.B: RESTORING THE DATAFILE FROM BACKUP
————————————————-
As mentioned before, this approach can only be followed if the database is
in ARCHIVELOG mode.  Here are the steps:
1. Offline the lost datafile.
SQL> ALTER DATABASE DATAFILE ‘<full_path_file_name>’ OFFLINE;
NOTE: Depending on the current amount of database activity,
you may have to create additional rollback segments in a different
tablespace to keep the database going while you take care of the problem.
2. Restore the datafile from a backup.
3. Issue the following query:
SQL> SELECT V1.GROUP#, MEMBER, SEQUENCE#
FROM V$LOG V1, V$LOGFILE V2
WHERE V1.GROUP# = V2.GROUP# ;
This will list all your online redolog files and their respective
sequence numbers.
4. Recover the datafile:
SQL> RECOVER DATAFILE ‘<full_path_file_name>’
5. Confirm each of the  logs that you are prompted for until you receive
the message “Media recovery complete”.  If you are prompted for a
non-existing archived log, Oracle probably needs one or more of the
online logs to proceed with the recovery.  Compare the sequence number
referenced in the ORA-280 message with the sequence numbers of your online
logs.  Then enter the full path name of one of the members of the redo group
whose sequence number matches the one you are being asked for.  Keep
entering online logs as requested until you receive the message
“Media recovery complete”.
6. Bring the datafile back online.
SQL> ALTER DATABASE DATAFILE ‘<full_path_file_name>’ ONLINE;

No comments: