Tuesday, July 14, 2009

Beginning PHP and Oracle From Novice to Professional by W. Jason Gilmore and Bob Bryla Chapter 40

Ensuring database availability is a critical skill you need even if your Oracle Database XE instance is used by a small group of developers in your department. Many types of database failures are beyond
your control as a DBA, such as disk failures, network failures, and user errors. This emphasizes the need to prepare in advance for all of these potential failures after assessing the cost of database down- time versus the effort required to harden your database against failure. Many of these failures, as you might expect, require you to work closely with the server system administrators and network admin- istrators to minimize the impact. You need to promptly receive notification when failures occur, or a warning when they are about to occur.
In this chapter, we start by presenting you with Oracle’s recommended best practices for ensuring the recoverability of your database when, not if, you have a database failure. If your database is a production database that must be available continuously, these requirements are mandatory. On the other hand, if your database is for development, an occasional backup may suffice. However, by using Oracle’s best practices, your downtime will be minimal in the event of a failure, giving you more time to focus on PHP application development instead of data recovery.
Next, we show you how to back up your database, using the Oracle Database XE scripts. Once you have backed up your database, you will need to know how to recover the database from a media failure such as a missing or corrupted datafile.

Backup and Recovery Best Practices

Oracle recommends several techniques you can use to ensure database availability and recoverability. Many of these techniques are automatically implemented when you install Oracle Database XE. However, there are a couple of places where you can tweak the default configuration to improve the recoverability further. We discuss these tweaks in the sections that follow.
Before we dig in to the recoverability and availability techniques, it is important to know the
types of failures you may encounter in your database so that you may respond appropriately when they occur. Database failures fall into two broad categories: media failures and nonmedia failures.
Media failures occur when a server disk or a disk controller fails and makes one or more of your database’s datafiles unusable (see Chapter 28 for an overview of Oracle Database XE’s storage struc- tures). After the hardware error is resolved (e.g., the server administrator replaces the disk drive), it is your responsibility to restore the corrupted or destroyed datafiles from a disk or tape backup. As the price of disk space falls, the added level of convenience and speed of disk makes tape backups less desirable except for archival purposes.
Nonmedia failures include all other types of failures. Here are the most common types of nonmedia failures and how you will deal with them:

687

• Statement failure: Your SQL statement fails because of a syntax problem, or your permissions do not allow you to execute the statement. The recovery process for fixing this error is relatively easy: use the correct syntax or obtain permissions on the objects in the SQL statement.
• Instance failure: The entire database fails due to a power failure, server hardware failure, or a bug in the Oracle software. Recovery from this type of failure is automatic: once the server hardware failure is fixed or the power is restored, Oracle Database XE uses the online redo log files to ensure that all committed transactions are recorded in the database’s datafiles. In the case of a possible Oracle software bug, your next step after restarting the database is to inves- tigate whether there is a patch file or a workaround for the software bug.
• Process failure: A user may be disconnected from the database due to a network connection failure or an exceeded resource limit (such as too much CPU time). The Oracle Database XE back- ground processes automatically clean up by freeing the memory used by the user connection and roll back any uncommitted transactions started during the user’s session.
• User error: A user may drop a table or delete rows from a table unintentionally.

Multiplexing Redo Log Files

As you remember from Chapter 28, the online redo log files are a key component required to recover from both instance failure and media failure. By default, Oracle Database XE creates the minimum number of redo log files (two). When the first redo log file fills with committed transactions, subse- quent transactions are written to the other redo log file. Whether you have two, three, or more redo log files, Oracle writes to the log files in a circular fashion. Thus, if you have ARCHIVELOG mode enabled, Oracle can write new transactions to the next redo log file while Oracle archives the previous online
log file. (We show you how to enable ARCHIVELOG mode in a coming section.)

■Note The terms redo log file and online redo log file are often used interchangeably. However, the distinction is
important when you are comparing online redo log files to archived (offline) redo log files.

To prevent loss of data if you lose one of the online redo log files, you can multiplex, or mirror, the redo log files. In other words, each redo log file, whether there are two, three, or more, has one or more identical copies. These copies are maintained automatically by Oracle processes. Writing to a specific log file occurs in parallel with all other log files in the group. While there is a very slight performance hit when the Oracle processes must write to two copies of the redo log file instead of just one, the slight overhead is easy to justify compared to the recovery time (including lost committed transactions) if a nonmultiplexed online log file is lost due to a hardware failure or other error. To see the current status of the online redo log files, start at the Oracle Database XE home page and navigate
to Administration ➤ Storage. In the Tasks section on the right side of the page, click View Logging
Status and you will see the names and status of the online redo log files. By default, Oracle Database
XE creates two online redo log files, as you can see in Figure 40-1.
Notice the directory path for the redo log files:

/usr/lib/oracle/xe/app/oracle/flash_recovery_area

Figure 40-1. Online redo log file status

This area, as you might surmise, is known as the Flash Recovery Area. The Flash Recovery Area automates the management for backups of all types of database objects such as multiplexed copies of the control file and online redo log files, archived redo log files, and datafiles. You specify the loca- tion of the Flash Recovery Area along with a maximum size, and Recovery Manager (RMAN) manages files within this area. You define the location and size of the Flash Recovery Area with two initializa- tion parameters. From the SQL command-line prompt run this command:

show parameter db_recov

You will see all parameters in the database that begin with db_recov:

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------ db_recovery_file_dest string /usr/lib/oracle/xe/app/oracle/
flash_recovery_area
db_recovery_file_dest_size big integer 10G

You can also view these parameters from the Home ➤ Administration ➤ About page in the Oracle Database XE GUI. To ensure prompt and easy recovery of any database object, your Flash Recovery Area should be large enough to hold at least one copy of all datafiles, incremental backups, online redo log files, control files, and any archived redo log files required to restore a database from the last full or incremental backup to the point in time of a media failure. You can check the status of the Flash Recovery Area by querying the dynamic performance view V$RECOVERY_FILE_DEST:

select name, space_limit, space_used from v$recovery_file_dest;

NAME SPACE_LIMIT SPACE_USED
-------------------------------------------------- ------------- ----------
/usr/lib/oracle/xe/app/oracle/flash_recovery_area 10737418240 851753472

Of the 10GB of space available in the Flash Recovery Area, less than 900MB is used. Multiplexing these redo log files is easy; the only catch is that there is no GUI interface available
for this operation—you must use a couple of SQL commands. You will put the multiplexed redo log files in the directory /u01/app/oracle/onlinelog. This file system is on a separate disk drive and a separate controller from the redo log files shown earlier in Figure 40-1. Connect as a user with SYSDBA privileges, and use these SQL statements:

alter database add logfile member
'/u01/app/oracle/onlinelog/g1m2.log' to group 1;
alter database add logfile member
'/u01/app/oracle/onlinelog/g2m2.log' to group 2;
Notice that you do not need to specify a size for the new redo log file group members; all files within the same redo log file group must have the same size, so Oracle automatically uses the file size of the files within the existing group. After you run these statements, you revisit the Database Logging page shown earlier in Figure 40-1, and you now see the same log file groups but with each having a multiplexed member, as shown in Figure 40-2.

Figure 40-2. Multiplexed online redo log files

Multiplexing Control Files

As you may remember from Chapter 28, the control file maintains the metadata for the physical structure of the entire database. It stores the name of the database, the names and locations of the tablespaces in the database, the locations of the redo log files, information about the last backup of each tablespace in the database, and much more. It may be one of the smallest yet most critical files in the database. If you have two or more multiplexed copies of the control file and you lose one, it is a very straightforward recovery process. However, if you have only one copy and you lose it due to corruption or hardware failure, the recovery procedure becomes very advanced and time consuming.
By default, Oracle Database XE creates only one copy of the control file. To multiplex the control
file, you need to follow a few simple steps. First, identify the location of the existing control file using the Home ➤ Administration ➤ About Database page, or use the following query:

select value from v$parameter where name = 'control_files';

On Linux you will see something similar to the following:

VALUE
-----------------------------------------------------
/usr/lib/oracle/xe/oradata/XE/control.dbf

The next step is to alter the SPFILE (see Chapter 28 for a discussion on types of parameter files) to add the location for the second control file. We will use the location /u01/app/oracle/controlfile to store the second copy of the control file. Here is the SQL statement you use to add the second location:

alter system set control_files =
'/usr/lib/oracle/xe/oradata/XE/control.dbf',
'/u01/app/oracle/controlfile/control2.dbf' scope=spfile;

Be sure to use SCOPE=SPFILE here, as in the example, since you cannot dynamically change the
CONTROL_FILES parameter while the database is open. Next, you must shut down the database as follows:

shutdown immediate

On Linux, you use your favorite GUI or operating system command line to make a copy of the first control file in the second location:

cp /usr/lib/oracle/xe/oradata/XE/control.dbf \
/u01/app/oracle/controlfile/control2.dbf

Finally, restart the database using this command at the SQL> prompt:

startup

Checking the dynamic performance view V$PARAMETER again, you can see that there are now two copies of the control file:

VALUE
--------------------------------------------------------------------------------
/usr/lib/oracle/xe/oradata/XE/control.dbf,
/u01/app/oracle/controlfile/control2.dbf

As with the members of a redo log file group, any changes to the control file are made to all copies. As a result, the loss of one control file is as easy as shutting down the database (if it is not down already), copying the remaining copy of the control file to the second location, and restarting the database.

Enabling ARCHIVELOG Mode

A database in ARCHIVELOG mode automatically backs up a filled online redo log file after the switch to the next online redo log file. Although this requires more disk space, there are two distinct advantages to using ARCHIVELOG mode:

• After media failure, you can recover all committed transactions up to the point in time of the media failure if you have backups of all archived and online redo log files since the last backup, the control file from the most recent backup, and all datafiles from the last backup.
• You can back up the database while it is online. If you do not use ARCHIVELOG mode, you must shut down the database to perform a database backup. This is an important consideration when you must have your database available to users 24 hours a day, 7 days a week.

By default, an Oracle Database XE installation is in NOARCHIVELOG mode. If your database is used primarily for development and you make occasional full backups of the database, this may be suffi- cient. However, if you use your database in a production environment, you should use ARCHIVELOG mode to ensure that no user transactions are lost due to a media failure. To enable ARCHIVELOG mode, perform the following steps. First, connect to the database with SYSDBA privileges, and shut down the database:

shutdown immediate

Next, start up the database in MOUNT mode. This mode reads the contents of the control file and starts the instance but does not open the datafiles:

startup mount

ORACLE instance started.

Total System Global Area 146800640 bytes Fixed Size 1257668 bytes Variable Size 88084284 bytes Database Buffers 54525952 bytes Redo Buffers 2932736 bytes Database mounted.

Next, enable ARCHIVELOG mode with this command:

alter database archivelog;

Finally, open the database:

alter database open;

The Oracle Database XE home page’s Usage Monitor section now indicates the new status of the database, as you can see in Figure 40-3.

Figure 40-3. Database status after enabling ARCHIVELOG mode

After you perform one full backup of the database, the archived and online log files will ensure that you will not lose any committed transactions due to media failure. In addition, to save disk space you can purge (or move to tape and then purge) all archived redo log files and previous backups created before the full backup. Only those archived redo log files created since the last full backup are needed to recover the database when a media failure occurs; the combination of a full backup and subsequent archived redo log files will ensure that you will not lose any committed transactions. The previous full backups and subsequent archived redo log files created before the latest full backup will only be useful if you need to restore the database to a point in time before the most recent full backup.

Backing Up the Database

Now that you have multiplexed your online redo log files, multiplexed your control files, and enabled ARCHIVELOG mode in your database, you are ready for your first full backup of the database. Any media recovery operation requires at least one full backup of the database, even if you are not in ARCHIVELOG mode. (Remember that an instance failure requires only the online redo log files for recovery.) You can back up manually, or schedule an automatic backup at regular intervals. We cover both of these scenarios in the following sections.

Manual Backups
Whether you are using Linux or Windows as your operating system, performing a manual backup is very straightforward. Under Linux, start with the Applications menu under Gnome, or the K menu if you’re using KDE, select Oracle Database 10g Express Edition ➤ Backup Database. For Windows, from the Start menu, select Programs ➤ Oracle Database 10g Express Edition ➤ Backup Database. In both cases, a console window launches so that you can interact with the backup script. This inter-
action occurs only if you are not in ARCHIVELOG mode. The backup script warns you that Oracle will shut down the database before a full backup can occur.
For a full backup under Linux, the output in the console window looks similar to if not exactly
like this:

Doing online backup of the database. Backup of the database succeeded.
Log file is at /usr/lib/oracle/xe/oxe_backup_current.log.
Press ENTER key to exit

The script output identifies the log file location. Oracle keeps the two most recent log files. The previous log file is at this location:

/usr/lib/oracle/xe/oxe_backup_previous.log

The log file contains the results of one or more RMAN sessions. After the backup completes, RMAN deletes all obsolete backups. By default, Oracle only keeps the last two full backups. If you are using Windows as your host operating system, the backup logs reside in these locations:

C:\ORACLEXE\APP\ORACLE\PRODUCT\10.2.0\SERVER\DATABASE\OXE_BACKUP_CURRENT.LOG C:\ORACLEXE\APP\ORACLE\PRODUCT\10.2.0\SERVER\DATABASE\OXE_BACKUP_PREVIOUS.LOG

Automatic Backups

Scheduling automatic backups is very straightforward. Oracle Database XE provides a script for each platform that you can launch using your favorite scheduling program, such as the cron program under Linux or the Scheduled Tasks wizard under Windows.
For Linux, the script is located here:

/usr/lib/oracle/xe/app/oracle/product/10.2.0/server/config/scripts/backup.sh

For Windows, the script is located here:

C:\oraclexe\app\oracle\product\10.2.0\server\BIN\BACKUP.BAT

The log files for each platform are located in the same location as if you ran the scripts manually.

Recovering Database Objects

Eventually disaster will strike and you will lose one of your key database files, either a datafile, a control file, or an online redo log file, due to a hardware failure or an administrator error. In the following scenario, one of the datafiles is accidentally deleted and you must recover the database back to the point of time where the database failed due to the missing datafile.
In a default Oracle Database XE installation, you have four datafiles. You can use the dynamic performance view V$DATAFILE to identify these datafiles:

select name from v$datafile;

NAME
--------------------------------------------------
/usr/lib/oracle/xe/oradata/XE/system.dbf
/usr/lib/oracle/xe/oradata/XE/undo.dbf
/usr/lib/oracle/xe/oradata/XE/sysaux.dbf
/usr/lib/oracle/xe/oradata/XE/users.dbf

The system administrator performs some routine disk space reclamation and accidentally deletes one of the datafiles on the Linux server:

rm /usr/lib/oracle/xe/oradata/XE/users.dbf

You immediately get phone calls from your users because all user tables are stored in the USERS tablespace, which in turn is stored in the operating system file /usr/lib/oracle/xe/oradata/XE/ users.dbf. A user reports seeing the error message shown in Figure 40-4 when she tries to browse the contents of one of her tables.

Figure 40-4. User error messages after the loss of a datafile

Your first thought is that there must be some error other than a missing datafile, so you first try to shut down and restart the database to see what happens. The startup messages look normal at first, but then after the database is mounted you see an error message similar to the following:

ORACLE instance started.

Total System Global Area 146800640 bytes Fixed Size 1257668 bytes Variable Size 88084284 bytes Database Buffers 54525952 bytes Redo Buffers 2932736 bytes Database mounted.
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: '/usr/lib/oracle/xe/oradata/XE/users.dbf'

You decide that a database recovery is your only option. From either the Windows or Linux GUI interface, select Restore Database from the same menu where you selected Backup Database, as noted earlier in the chapter, to back up the database. A command window opens to ensure that you
know a shutdown must occur to restore the database to its previous state:

This operation will shut down and restore the database. Are you sure [Y/N]?

After you type Y, the restore operation proceeds with no further intervention other than to confirm that the operation is complete:

This operation will shut down and restore the database. Are you sure [Y/N]?y
Restore in progress...
Restore of the database succeeded.
Log file is at /usr/lib/oracle/xe/oxe_restore.log. Press ENTER key to exit

The restore operation automatically starts the database after completion. If you are curious as to which RMAN commands were used to recover from the media failure (deleted datafile), you can look in the log file identified by the script at /usr/lib/oracle/xe/oxe_restore.log.
When the user reloads the Web page containing her Oracle Database XE session, she suddenly sees the table she was attempting to browse, as shown in Figure 40-5.
The user didn’t even have to log out and log back in; when the database was back up (after being
shut down and restarted several times, including an attempt by the DBA to shut down and restart), refreshing the page logged the user back in behind the scenes and kept her on the page where she left off. She was none the wiser about the multiple database shutdowns and restarts; all of her committed transactions are still in the database as well.

Figure 40-5. Refreshed Web page after media recovery

Summary

This chapter gave you the basics for backing up and recovering your database. Although backup and recovery operations are not the most glamorous of tasks compared to application development, a database that is down because of a disk failure quickly becomes highly visible to upper management when the PHP developers (including yourself) are not able to store and retrieve application data. Implementing Oracle’s best practices to ensure database availability include multiplexing redo log files and control files, enabling ARCHIVELOG mode to ensure recoverability from media failure, and leveraging the Flash Recovery Area to quickly recover from media failure or user error.

0 comments: