Wednesday, August 12, 2009

MORE MAGIC METHODS

So far we have come across the magic meth- ods __construct, __destruct, and __toString,
and have discussed them in detail. The remaining magic methods are __autoload,
__call, __clone, __get, __set, __sleep, __wakeup, __unset,
and __isset.1 As you might expect, they only make sense in the context of object-oriented programming (OOP).
The syntactic element common to all magic methods is that they begin with a double underscore. They are all also usually invoked indirectly rather than directly. As we have seen, the __construct method of a class is invoked when we use the new operator and a class name. If we have a class called MyClass that defines a constructor, the statement $m = new MyClass(); indirectly calls the __construct method of this class.
However, the fact that all magic methods are called indirectly masks important differences between them. Having a uniform constructor for every class yields benefits when a parent constructor needs to be called, but there is no intrinsic need for this method to be magic. For example, in Java, construc- tors bear the name of the class with no serious negative consequences. On the other hand, destructors are a necessity and would seem to have to be magic. They are not invoked by any action of the developer, but automatically when an object goes out of scope. Then there’s the __toString method, which is called implicitly whenever an object is displayed using print or echo—a convenience method more than anything else. In any case, the point is that the reasons for providing magic methods are various and in each case worth examining.
In this chapter we will look at those magic methods that we haven’t yet discussed. Related and complementary methods will be discussed together.

To set the context for this discussion, remember that we spent some time discussing accessor, or set and get methods, in Chapter 6. There I argued that instance variables should be made private and only retrieved or changed through accessor methods. Doing otherwise violates the object-oriented (OO) principle of data hiding (or encapsulation if you prefer) and leaves instance variables exposed to inadvertent changes.
PHP 5 introduces magic set and get methods for undefined instance vari- ables. Let’s see what this means by looking at an example. Suppose you have a class, Person, devoid of any data members or methods, defined as follows:

class Person{
}

PHP allows you to do the following:

$p = new Person();
$p->name = "Fred";
$p->street = "36 Springdale Blvd";

Even though name and street data members have not been declared within the Person class, you can assign them values and, once assigned, you can retrieve those values. This is what is meant by undefined instance variables. You can cre- ate magic set and get methods to handle any undefined instance variables by making the following changes to the Person class, as shown in Listing 13-1.

class Person{
protected $datamembers = array();
public function __set($variable, $value){
//perhaps check value passed in
$this->datamembers[$variable] = $value;
}
public function __get($variable){
return $this->datamembers[$variable];
}
}
$p = new Person();
$p-> name = "Fred";

You add an array to your class and use it to capture any undeclared instance variables. With these revisions, assigning a value to an undeclared data member called name invokes the __set method in the background, and an array element with the key name will be assigned a value of “Fred.” In a similar fashion the __get method will retrieve name.

Is It Worth It?

Magic set and get methods are introduced as a convenience, but it is certainly questionable whether they are worth the effort. Encouraging the use of undefined data members can easily lead to difficulties when debugging. For instance, if you want to change the value of the name data member of your Person class instance, but misspell it, PHP will quietly create another instance variable. Setting a nonexistent data member produces no error or warning, so your spelling error will be difficult to catch. On the other hand, attempting to use an undefined method produces a fatal error. For this reason, declaring data members to be private (or protected), and ensuring that they are only accessible through declared accessor methods, eliminates the danger of accidentally creating a new unwanted data member. Using declared data members means fewer debugging problems.
Undeclared data members also seem contrary to the principles of OOP. Although you might argue that encapsulation has been preserved because undeclared data members are only accessed indirectly through the magic methods, the real point of accessor methods is to control how instance variables are changed or retrieved. The comment inside the __set method (//perhaps check value passed in) in Listing 13-1 suggests that such controls could be implemented, but in order to do so you would need to know the vari- able names beforehand—an impossibility given that they are undeclared. Why not just set up properly declared data members?
Allowing undeclared data members also undermines another basic con- cept of OOP, namely inheritance. It’s hard to see how a derived class might inherit undeclared instance variables.
One might argue, though, that these magic methods make PHP easier to use and this convenience offsets any of the disadvantages. After all, the original and continuing impetus behind PHP is to simplify web development. Allowing undeclared data members in PHP 5 is perhaps a necessary evil because doing so keeps backward compatibility with PHP 4. While it’s easy to criticize magic set and get methods, in Chapter 16, when discussing the PDORow class, you’ll see that these methods can come in very handy.

isset and unset

PHP 5.1.0 introduces the magic methods __isset and __unset. These methods are called indirectly by the built-in PHP functions isset and unset. The need for these magic methods results directly from the existence of magic set and get methods for undeclared data members. The magic method __isset will be called whenever isset is used with an undeclared data member.

Suppose you want to determine whether the name variable of your Person instance in Listing 13-1 has been set. If you execute the code isset($t->name);, the return value will be false. To properly check whether an undeclared data member has been set, you need to define an __isset method. Redo the code for the Person class to incorporate a magic isset method (see Listing 13-2).

class Person{
protected $datamembers = array();
private $declaredvar = 1;
public function __set($variable, $value){
//perhaps check value passed in
$this->datamembers[$variable] = $value;
}
public function __get($variable){
return $this->datamembers[$variable];
}
function __isset($name){
return isset($this->datamembers[$name]);
}
function getDeclaredVariable(){
return $this->declaredvar;
}
}
$p = new Person();
$p->name = 'Fred';
echo '$name: '. isset($p-> name). '<br />';//returns true
$temp = $p->getDeclaredVariable();
echo '$declaredvar: '. isset( $temp). '<br />';//returns true
true true

Listing 13-2: The Person class with a magic __isset method

Calling isset against the undeclared data member name will return true because an implicit call is made to the __isset method. Testing whether
a declared data member is set will also return true, but no call, implicit or otherwise, is made to __isset. We haven’t provided an __unset method, but by looking at the __isset method you can easily see how an undeclared variable might be unset.
You have __isset and __unset methods only because there are magic set and get methods. All in all, in most situations, it seems simpler to forget about using undeclared data members, and thereby do away with the need for magic set and get methods and their companion __isset and __unset methods.

The magic method __call is to undeclared methods what get and __set are
to undeclared data members. This is another magic method provided as a con- venience. At first, it is a little difficult to imagine what an undeclared method might be and what use it might have. Well, here’s one way that this method

can prove useful. Suppose you wanted to add to the functionality of the MySQLResultSet class defined in Chapters 9 and 10, so as to retrieve the current system status in this fashion:

//assume $rs is an instance of MySQLResultSet
$rs->stat();

You could just create a wrapper method for the existing MySQL function, mysql_stat, as you did when creating other methods of this class. For example, the existing getInsertId method simply encloses a call to mysql_insert_id. You could do exactly the same thing with mysql_stat. However, the more versatile option is to add a __call method similar to the following code:

public function __call($name, $args){
$name = "mysql_". $name(;
if(function_exists($name)){
return call_user_func_array($name, $args);
}
}

When you call the stat method against a MySQLResultSet object, the method name, stat, is passed to the __call method where mysql_ is prepended. The mysql_stat method is then invoked by the call_user_func_array function. Not only can you call the mysql_stat function, but once __call is defined you can call any MySQL function against a MySQLResultSet class instance by simply using the function name, minus the leading mysql_, and supplying any required arguments. This magic method does away with the need for writing wrapper methods for existing MySQL function, and allows them to be “inherited.” If you’re already familiar with the MySQL function names it also makes for easy use of the class.
However, this magic method is not quite as convenient as it might seem at first glance. Functions such as mysql_fetch_array that require that a result set resource be passed even though the class is itself a result set resource make nonsense of the whole notion of an object—why should an object need to pass a copy of itself in order to make a method call? On the other hand, this is an easy and natural way to incorporate functions such as mysql_stat
and mysql_errno that don’t require any arguments, or functions such as mysql_escape_string that require primitive data types as arguments. If properly used, this convenience method seems much more defensible than the __set and __get methods.

The __autoload function is a convenience that allows you to use classes without having to explicitly write code to include them. It’s a bit different from other magic methods because it is not incorporated into a class definition. It is simply included in your code like any other procedural function.

Normally, to use classes you would include them in the following way:

require 'MySQLResultSet.php'; require 'MySQLConnect.php'; require 'PageNavigator.php'; require 'DirectoryItems.php'; require 'Documenter.php';

These five lines of code can be replaced with the following:

function __autoload($class) {
require $class '.php';
}

The __autoload function will be invoked whenever there is an attempt to use a class that has not been explicitly included. The class name will be passed to this magic function, and the class can then be included by creating the filename that holds the class definition. Of course, to use __autoload as coded above, the class definition file will have to be in the current directory or in the include path.
Using autoload is especially convenient when your code includes numer- ous class files. There is no performance penalty to pay—in fact, there may be performance improvements if not all classes are used all the time. Use of the
autoload function also has the beneficial side effect of requiring strict naming conventions for files that hold class definitions. You can see from the previous code listing that the naming conventions used in this book (i.e., combining the class name and the extension .php to form the filename) will work fine with __autoload.

sleep and wakeup

These magic methods have been available since PHP 4 and are invoked by the variable handling functions serialize and unserialize. They control how an object is represented so that it can be stored and recreated. The way that you store or communicate an integer is fairly trivial, but objects are more com- plex than primitive data types. Just as the __toString method controls how an object is displayed to the screen, sleep controls how an object will be stored. This magic method is invoked indirectly whenever a call to the serialize function is made. Cleanup operations such as closing a database connection can be performed within the __sleep method before an object is serialized.
Conversely, __wakeup is invoked by unserialize and restores the object.

clone

Like the constructor, __clone is invoked by a PHP operator, in this case clone. This is a new operator introduced with PHP 5. To see why it is necessary, we need to take a look at how objects are copied in PHP 4.

In PHP 4 objects are copied in exactly the same way that regular variables are copied. To illustrate, let’s reuse the Person class shown in Listing 13-1 (see Listing 13-3).

$x = 3;
$y = $x;
$y = 4;
echo $x. '<br />';
echo $y. '<br />';
$obj1 = new Person();
$obj1->name = 'Waldo';
$obj2 = $obj1;
$obj2->name = 'Tom';
echo $obj1->name. '<br />';
echo $obj2->name;

Listing 13-3: Using the assignment operator under PHP 4

If the code in Listing 13-3 is run under PHP 4, the output will be as follows:

3
4
Waldo
Tom

The assignment of $obj1 to $obj2 ( ) creates a separate copy of a Person just as the assignment of $x to $y creates a separate integer container. Chang- ing the name attribute of $obj2 does not affect $obj1 in any way, just as changing the value of $y doesn’t affect $x.
In PHP 5, the assignment operator behaves differently when it is used with objects. When run under PHP 5, the output of the code in Listing 13-3 is the following:

3
4
Tom
Tom

For both objects the name attribute is now Tom.

Where’s Waldo?

In PHP 5, the assignment of one object to another creates a reference rather than a copy. This means that $obj2 is not an independent object but another means of referring to $obj1. Any changes to $obj2 will also change $obj1. Using the assignment operator with objects under PHP 5 is equivalent to assigning by reference under PHP 4. (You may recall our use of the assignment by ref- erence operator in Chapter 4.)

In other words, in PHP 5

//PHP 5
$obj2 = $obj1;

achieves the same result as

//PHP 4
$obj2 =& $obj1;

The same logic applies when an object is passed to a function. This is not surprising, because there is an implicit assignment when passing a variable to a function. Under PHP 4, when objects are passed to functions, the default is to pass them by value, creating a copy in exactly the same way as with any primi- tive variable. This behavior was changed in PHP 5 because of the inefficiencies associated with passing by value. Why pass by value and use up memory when, in most cases, all that’s wanted is a reference? To summarize, in PHP 5, when an object is passed to a function or when one object is assigned to another, it is assigned by reference. However, there are some situations where you do want to create a copy of an object and not just another reference to the same object. Hence the need to introduce the clone operator.

NOTE If you are porting PHP 4 code to a server running PHP 5, you can remove all those ungainly ampersands associated with passing an object by reference or assigning it by reference.

clone

To understand the clone operator, let’s use the Person class again, adding a few more lines of code to Listing 13-3 to create the code in Listing 13-4.

if ($obj1 === $obj2) {
echo '$obj2 equals $obj1.<br />';
}
$obj3 = clone $obj1; echo 'After cloning '; if ($obj1 === $obj3){
//this code will execute
echo '$obj3 equals $obj1.<br />';
}else{
echo '$obj3 does not equal $obj1.<br />';
}
$obj3->name = 'Waldo';
echo 'Here\'s '. $obj1->name. '.<br />';
echo 'Here\'s '. $obj3->name. '.<br />';
$obj2 equals $obj1
After cloning $obj3 does not equal $obj1.
Here's Tom. Here's Waldo.

Listing 13-4: Finding Waldo

Remember that in Listing 13-3 $obj1 was assigned to $obj2, so the identity test conducted here shows that they are equal. This is because $obj2 is a reference to $obj1. After $obj1 is cloned to create $obj3 in Listing 13-4, the test for identity produces a negative result.
The name attribute of your newly cloned object is changed, and the output shows that this change does not affect the original object. In PHP 5, cloning an object makes a copy of an object just as the assignment operator does in PHP 4.
You may have supposed that in our search for Waldo we lost sight of our ultimate goal. Not true. Now that you understand the clone operator, you can make sense of the __clone method. It is invoked in the background when an object is cloned. It allows you to fine-tune what happens when an object is copied. This is best demonstrated using an aggregate class as an example.

Aggregate Classes

An aggregate class is any class that includes a data member that is itself an object. Let’s quickly create a Team class as an example. This class has as a data member, an array of objects called players. The class definitions for the Player class and the Team class are shown in Listing 13-5.

class Player{ private $name; private $position;
public function __construct($name){
$this->name = $name;
}
public function getName(){
return $this->name;
}
public function setPosition($position){
$this->position = $position;
}
}
class Team{
private $players = array();
private $name;
public function __construct($name){
$this->name = $name;
}
public function addPlayer(Player $p){
$this->players[] = $p;
}
public function getPlayers(){
return $this->players;
}
public function getName(){
return $this->name;
}

public function setName($name){
$this->name = $name;
}
}

Listing 13-5: The Team aggregate class

Let’s create a player, add him to a team, and see what happens when you clone that object (see Listing 13-6).

$rovers = new Team('Rovers');
$roy = new Player('Roy');
$roy->setPosition('striker');
$rovers->addPlayer($roy);
$reserves = clone $rovers;
$reserves->setName('Reserves');
//changes both with __clone undefined
$roy->setPosition('midfielder'); echo $rovers->getName(). ' '; print_r($rovers->getPlayers()); echo '<br /><br />';
echo $reserves->getName(). ' ';
print_r($reserves->getPlayers());

Listing 13-6: Cloning an aggregate object

Setting a player’s position after the clone operation changes the value of position for the player in both objects. Outputting the players array proves this—Roy’s position is the same for both objects (see Listing 13-7).

Rovers Array ( [0] => Player Object ( [name:private] => Roy [position:private]
=> midfielder ) )
Reserves Array ( [0] => Player Object ( [name:private] => Roy
[position:private] => midfielder ) )

Listing 13-7: Undesired result of cloning

Because player is an object, the default behavior when making a copy is to create a reference rather than an independent object. For this reason, any change to an existing player affects the players array for both Team instances. This is known as a shallow copy and in most cases doesn’t yield the desired result. The magic clone method was introduced in order to deal with situa- tions such as this. Let’s add a __clone method to the Team class so that each team has a separate array of players. The code to do this is as follows:

public function __clone(){
$newarray = array();
foreach ($this->players as $p){
$newarray[] = clone $p;
}
$this->players = $newarray;
}

While looping through the array of players each individual player
is cloned and added to a new array. Doing this creates a separate array for the cloned team. After making these changes, running the code in Listing 13-6 now yields this output:

Rovers Array ( [0] => Player Object ( [name:private] => Roy [position:private]
=> midfielder ) )
Reserves Array ( [0] => Player Object ( [name:private] => Roy
[position:private] => striker ) )

Changing a player originally added to the $rovers has no effect on the player array in the cloned $reserves object. This is the result you want. The magic clone method allows you to define what happens when an object is cloned. Using the terminology of other OO languages, the __clone method is
a copy constructor. It’s up to the developer to decide what’s appropriate for any particular class, but as a general rule, a __clone method should always be defined for aggregate classes for the exact reasons shown in the sample code—normally the same variable isn’t shared among different instances of a class. (That’s what static data members are for.)

A Get Method for Object Data Members of an Aggregate Class

When I first introduced accessor methods in Chapter 5, I noted that one of the advantages of a get method over direct access to a public data member was that accessor methods return a copy rather than an original. This is not true when the data members are themselves objects—by default objects are returned by reference. In the interests of data protection, it is usually better to return a copy using the clone operator. With this in mind, let’s rewrite the getPlayers method originally shown in Listing 13-5, shown here in Listing 13-8.

public function getPlayers(){
$arraycopy = array();
foreach ($this->players as $p){
$arraycopy[] = clone $p;
}
return $arraycopy;
}

Listing 13-8: Returning a copy of an object

The array returned by this version of the getPlayers method is only a copy so changes made to it will have no effect on the data member, $players. If you need to make changes to the players array a set method will have to be written. Doing so is a fairly straightforward matter so I’ll leave that up to you.
The fact that objects are passed by reference also has implications for how objects are added to an aggregate class. For instance, consider the player Roy who is added to the team Rovers in Listing 13-6. Any changes made to the variable $roy will change the first element of the $players array in the $rovers object. This may or may not be what’s wanted. If not, then players should be cloned before being added to the players array.

The addPlayer method of the Team class could be rewritten as:

public function addPlayer(Player $p){
$newplayer = clone $p;
$this->players[] = $newplayer;
}

The Team class now has complete control over any player added. This will doubtless be the implementation preferred by any Team manager.

NOTE The fact that PHP 5 returns a reference to an object rather than a copy may have serious implications for aggregate objects written under PHP 4 and running under PHP 5. Objects formerly returned by value will now be returned by reference, breaking
encapsulation.

No Clones Allowed

As you saw when we discussed the singleton class in Chapter 11, in some cases it may make sense to disallow copies altogether. This can be done by imple- menting code such as the following:

public final function __clone(){
throw new Exception('No clones allowed!');
}

Making this method final ensures that this behavior can’t be overridden in derived classes, and making it public displays a clear error message when there is an attempt at cloning. Making the __clone method private would also disallow cloning but displays a less explicit message: Access level must be public.

A Note About Overloading

On the PHP website the __set, __get, and __call methods are referred to as overloaded methods. Within the context of OOP, an overloaded method is one that has the same name as another method of the same class but differs in the number or type of arguments—methods with the same name but a differ- ent “signature.” Because PHP is a typeless language and doesn’t really care how many arguments are passed, this kind of overloading is an impossibility. In PHP, overloading usually refers to methods that perform a variety of differ- ent tasks.
Languages such as C++ also support something called operator overloads. For example, a programmer can define what the “greater than” operator means when two objects of the same class are compared. Support for such features has been described as “syntactic sugar” because, most of the time, operator overloads are not strictly necessary—the same effect could be

achieved by writing a method rather than overloading an operator. How- ever, operator overloads can be convenient and intuitive. It makes more sense to write:

if($obj1 > $obj2)

than

if($obj1->isGreaterThan($obj2))

The __toString method, while it is not an operator overload, offers a con- venience similar to that of an operator overload. It is certainly nice to be able to control what is displayed to the screen when an object is echoed.
PHP supports operator overloading for clone and, as you have seen, this is not syntactic sugar but a matter of necessity. The same can be said of the
__sleep and __wakeup methods because, as with destructors, the circumstances under which these methods are invoked aren’t always under the direct control of the developer.
All other magic methods are there for convenience so would perhaps qualify as “syntactic sugar.” I won’t repeat the opinions expressed earlier about __set and __get or press the point. After all, PHP places a high premium on user convenience, aiming to get the job done quickly and easily. Undoubt- edly, this is the reason for its success. If PHP’s aim was language purity, OO or otherwise, it probably wouldn’t still be with us.

0 comments: