Tuesday, July 14, 2009

AJAX and PHP Building Responsive Web Applications by Cristian Darie, Bogdan Brinzarea Chapter 9

AJAX RSS Reader

In the last few years, the Web has become much more active than it used to be. Today, we see an explosion of new sources of information: news sites appear every day (for example, http://www.digg.com and http://www.newsvine.com), and weblogs are a growing trend (every person seems to have a weblog these days).

As a natural reaction to this flood of information, many systems that group, filter, and aggregate it have appeared. In practice this is implemented through web syndication, in which parts of a website (such as news, weblog posts, articles, and so on) are made available for other sites or applications to use.

In order to be usable by other parties, the data to be shared must be in a generic format that can be laid out differently than in the original source. RSS 2.0 and Atom are the most popular such formats.

Learn more about the history of RSS and Atom on Wikipedia. The link to the RSS page is http://en.wikipedia.org/wiki/RSS_(protocol).

In this chapter, we'll analyze the RSS file format, then take a look at Google Reader (Google's
RSS aggregator), and then build our own RSS aggregator web page with AJAX and PHP.

Working with RSS
RSS is a widely used XML-based standard, used to exchange information between applications on the Internet. One of the great advantages of XML is that it is plain text, thus easily read by any application. RSS feeds can be viewed as plain text files, but it doesn't make much sense to use them like that, as they are meant to be read by specialized software that generates web content based on their data.

While RSS is not the only standard for expressing feeds as XML, we've chosen to use this format in the case study because it's very widely used. In order to better understand RSS, we need to see what lies underneath the name; the RSS document structure, that is.

The RSS Document Structure
The first version of RSS was created in 1999. This is known as version 0.9. Since then it has evolved to the current 2.0.1 version, which has been frozen by the development community, as future development is expected to be done under a different name.

A typical RSS feed might look like this:

<rss version="2.0">
<channel>
<title>CNN.com</title>
<link>http://www.example.org</link>
<description>A short description of this feed</description>
<language>en</language>
<pubDate>Mon, 17 Oct 2005 07:56:23 EDT</pubDate>
<item>
<title>Catchy Title</title>
<link>http://www.example.org/2005/11/catchy-title.html</link>
<description>
The description can hold any content you wish, including XHTML.
</description>
<pubDate>Mon, 17 Oct 2005 07:55:28 EDT</pubDate>
</item>
<item>
<title>Another Catchy Title</title>
<link>http://www.example.org/2005/11/another-catchy-title.html</link>
<description>
The description can hold any content you wish, including XHTML.
</description>
<pubDate>Mon, 17 Oct 2005 07:55:28 EDT</pubDate>
</item>
</channel>
</rss>

The feed may contain any number of <item> items, each item holding different news or blog entries or whatever content you wish to store.

This is all plain text, but as we stated above, we need special software that will parse the XML and return the information we want. An RSS parser is called an aggregator because it can usually extract and aggregate information from more than one RSS source.
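
To make the idea of parsing and aggregating concrete, here is a minimal JavaScript sketch. It is not part of the application built in this chapter, and the names getTagText, extractItems, and aggregate are hypothetical. It uses naive regular expressions and assumes a simple feed without CDATA sections or attributes on the tags it reads; a real aggregator would use a proper XML parser.

```javascript
// Naive RSS 2.0 item extraction, for illustration only.
// Assumption: simple feed, no CDATA sections, no attributes on the tags read here.
function getTagText(xml, tag) {
  var match = new RegExp("<" + tag + ">([\\s\\S]*?)</" + tag + ">").exec(xml);
  return match ? match[1].trim() : "";
}

// Return one plain object per <item> element found in the feed text.
function extractItems(xml) {
  var items = [];
  var itemPattern = /<item>([\s\S]*?)<\/item>/g;
  var match;
  while ((match = itemPattern.exec(xml)) !== null) {
    items.push({
      title: getTagText(match[1], "title"),
      link: getTagText(match[1], "link"),
      pubDate: getTagText(match[1], "pubDate")
    });
  }
  return items;
}

// Aggregate: merge the items of several feeds into one list,
// tagging each item with the (hypothetical) name of its source.
function aggregate(feeds) {
  var all = [];
  feeds.forEach(function (feed) {
    extractItems(feed.xml).forEach(function (item) {
      item.source = feed.name;
      all.push(item);
    });
  });
  return all;
}

var sampleFeed =
  "<rss version=\"2.0\"><channel><title>Example</title>" +
  "<item><title>Catchy Title</title>" +
  "<link>http://www.example.org/2005/11/catchy-title.html</link>" +
  "<pubDate>Mon, 17 Oct 2005 07:55:28 EDT</pubDate></item>" +
  "</channel></rss>";

console.log(extractItems(sampleFeed));
```

The aggregate function is what separates an aggregator from a plain parser: it combines items from more than one source into a single list.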

Such an application is Google Reader, an online service from Google, launched in fall 2005. A
veteran web-based RSS reader service is the one at http://www.bloglines.com.

Google Reader
Google Reader (http://reader.google.com) provides a simple and intuitive AJAX-enabled interface that helps users keep track of their RSS subscriptions and reading. It hasn't been long since this service was launched (it's still in beta at the moment of writing), but it has already got a great deal of attention from users. Figure 9.1 shows the Google Reader in action, reading a news item from Packt Publishing's RSS feed.

Figure 9.1: Managing RSS Subscriptions (Feeds) on Google Reader

Implementing the AJAX RSS Reader
In order for this exercise to function correctly, you need to enable XSL support in your PHP
installation. Appendix A contains installation instructions that include XSL support.

In the exercise that will follow we will build our own AJAX-enabled RSS reader application. The main characteristics for the application are:
1. We'll keep the application simple. The list of feeds will be hard-coded in a PHP file on the server.
2. We'll use XSLT to transform the RSS feed data into something that we can display
to the visitor. In this chapter, the XSL transformation will be performed on the server side, using PHP code.
3. We'll use the SimpleXML library to read the XML response from the news server.
SimpleXML was introduced in PHP 5, and you can find its official documentation at http://php.net/simplexml. SimpleXML is an excellent library that can make reading XML sources much easier than using the DOM.

4. The application will look like Figure 9.2:

Figure 9.2: Our AJAX-enabled RSS Reader Start Page

Feeds are loaded dynamically and are displayed as links in the left column. Clicking on a feed will trigger an HTTP request and the server script will acquire the desired RSS feed.

The server then formats the feed with XSL and returns an XML string. Results are then displayed in a human-readable form.

Time for Action—Building the RSS Reader Application
1. In your ajax folder, create a new folder named rss_reader.
2. Let's start with the server. Create a new file named rss_reader.php, and add this code to it:
<?php
// load helper scripts
require_once ('error_handler.php');
require_once ('rss_reader.class.php');
// create a new RSS Reader instance
$reader = new CRssReader(urldecode($_POST['feed']));
// clear the output
if(ob_get_length()) ob_clean();
// headers are sent to prevent browsers from caching
header('Expires: Fri, 25 Dec 1980 00:00:00 GMT'); // time in the past
header('Last-Modified: ' . gmdate('D, d M Y H:i:s') . ' GMT');
header('Cache-Control: no-cache, must-revalidate');
header('Pragma: no-cache');
header('Content-Type: text/xml');
// return the news to the client
echo $reader->getFormattedXML();
?>

3. Create a new file named rss_reader.class.php, and add this code to it:
<?php
// this class retrieves an RSS feed and performs a XSLT transformation
class CRssReader
{
private $mXml;
private $mXsl;

// Constructor - creates an XML object based on the specified feed
function __construct($szFeed)
{
// retrieve the RSS feed in a SimpleXML object
$this->mXml = simplexml_load_file(urldecode($szFeed));
// retrieve the XSL file contents in a SimpleXML object
$this->mXsl = simplexml_load_file('rss_reader.xsl');
}

// Creates a formatted XML document based on retrieved feed
public function getFormattedXML()
{
// create the XSLTProcessor object
$proc = new XSLTProcessor;
// attach the XSL
$proc->importStyleSheet($this->mXsl);
// apply the transformation and return formatted data as XML string
return $proc->transformToXML($this->mXml);
}
}
?>

4. Create a new file named rss_reader.xsl, and add this code to it:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<dl>
<xsl:for-each select="rss/channel/item">
<dt><h3><xsl:value-of select="title" /></h3></dt>
<dd>
<span><xsl:value-of select="pubDate" /></span>
<p>
<xsl:value-of select="description" />
<br />
<xsl:element name="a">
<xsl:attribute name = "href">
<xsl:value-of select="link" />
</xsl:attribute>
read full article
</xsl:element>
</p>
</dd>
</xsl:for-each>
</dl>
</xsl:template>
</xsl:stylesheet>

5. Now add the standard error-handling file, error_handler.php. Feel free to copy this file from the previous chapter. Anyway, here's the code for it:

<?php
// set the user error handler method to be error_handler
set_error_handler('error_handler', E_ALL);

// error handler function
function error_handler($errNo, $errStr, $errFile, $errLine)
{
// clear any output that has already been generated
if(ob_get_length()) ob_clean();
// output the error message
$error_message = 'ERRNO: ' . $errNo . chr(10) .
'TEXT: ' . $errStr . chr(10) .
'LOCATION: ' . $errFile .
', line ' . $errLine;
echo $error_message;
// prevent processing any more PHP scripts
exit;
}
?>

6. In the rss_reader folder, create a file named config.php, where we'll add the feeds our application will aggregate.
<?php
// Set up some feeds
$feeds = array ('0' => array('title' => 'CNN Technology',
'feed' =>
'http://rss.cnn.com/rss/cnn_tech.rss'),
'1' => array('title' => 'BBC News',
'feed' =>
'http://news.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml'),
'2' => array('title' => 'Wired News',
'feed' =>
'http://wirednews.com/news/feeds/rss2/0,2610,3,00.xml'));
?>

7. Create a new file named index.php, and add this code to it:
<?php
// load the list of feeds
require_once ('config.php');
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>AJAX RSS Reader</title>
<link rel="stylesheet" type="text/css" href="rss_reader.css"/>
<script src="rss_reader.js" type="text/javascript"></script>
</head>
<body>
<h1>AJAX RSS Reader</h1>
<div id="feeds">
<h2>Feeds</h2>
<ul id="feedList">
<?php
// Display feeds
for ($i = 0; $i < count($feeds); $i++)
{
echo '<li id="feed-' . $i . '"><a href="javascript:void(0);" ';
echo 'onclick="getFeed(document.getElementById(\'feed-' . $i .
'\'), \'' . urlencode($feeds[$i]['feed']) . '\');">';
echo $feeds[$i]['title'] . '</a></li>';
}
?>

</ul>
</div>
<div id="content">
<div id="loading" style="display:none">Loading feed...</div>
<div id="feedContainer" style="display:none"></div>
<div id="home">
<h2>About the AJAX RSS Reader</h2>
<p>
The AJAX RSS reader is only a simple application that provides basic functionality for retrieving RSS feeds.
</p>
<p>
This application is presented as a case study in
<a href="https://www.packtpub.com/ajax_php/book"> Building
Responsive Web Applications with AJAX and PHP</a> (Packt Publishing, 2006).
</p>
</div>
</div>
</body>
</html>

8. Create a new file named rss_reader.js, and add this code to it:
// holds an instance of XMLHttpRequest
var xmlHttp = createXmlHttpRequestObject();
// when set to true, display detailed error messages
var showErrors = true;

// creates an XMLHttpRequest instance
function createXmlHttpRequestObject()
{
// will store the reference to the XMLHttpRequest object
var xmlHttp;
// this should work for all browsers except IE6 and older
try
{
// try to create XMLHttpRequest object
xmlHttp = new XMLHttpRequest();
}
catch(e)
{
// assume IE6 or older
var XmlHttpVersions = new Array("MSXML2.XMLHTTP.6.0", "MSXML2.XMLHTTP.5.0", "MSXML2.XMLHTTP.4.0", "MSXML2.XMLHTTP.3.0", "MSXML2.XMLHTTP", "Microsoft.XMLHTTP");
// try every prog id until one works
for (var i=0; i<XmlHttpVersions.length && !xmlHttp; i++)
{
try
{
// try to create XMLHttpRequest object
xmlHttp = new ActiveXObject(XmlHttpVersions[i]);
}
catch (e) {} // ignore potential error
}
}
// return the created object or display an error message
if (!xmlHttp)
alert("Error creating the XMLHttpRequest object.");
else
return xmlHttp;
}

// function that displays an error message
function displayError($message)
{
// ignore errors if showErrors is false
if (showErrors)
{
// turn error displaying Off
showErrors = false;
// display error message
alert("Error encountered: \n" + $message);
}
}

// Retrieve titles from a feed and display them
function getFeed(feedLink, feed)
{
// only continue if xmlHttp isn't void
if (xmlHttp)
{
// try to connect to the server
try
{
if (xmlHttp.readyState == 4 || xmlHttp.readyState == 0)
{
/* Get number of feeds and loop through each one of them to change the class name of their container (<li>). */
var numberOfFeeds =
document.getElementById("feedList").childNodes.length;
for (i = 0; i < numberOfFeeds; i++)
document.getElementById("feedList").childNodes[i].className = "";
// Change the class name for the clicked feed so it becomes
// highlighted
feedLink.className = "active";
// Display "Loading..." message while loading feed
document.getElementById("loading").style.display = "block";
// Call the server page to execute the server-side operation
var params = "feed=" + feed;
xmlHttp.open("POST", "rss_reader.php", true);
xmlHttp.setRequestHeader("Content-Type",
"application/x-www-form-urlencoded");
xmlHttp.onreadystatechange = handleHttpGetFeeds;
xmlHttp.send(params);
}
else
{
// if connection was busy, try again after 1 second
// (a function is passed so that the feedLink element is kept as an object)
setTimeout(function() { getFeed(feedLink, feed); }, 1000);
}
}
// display the error in case of failure
catch (e)
{
displayError(e.toString());
}
}
}

// function that retrieves the HTTP response
function handleHttpGetFeeds()
{
// continue if the process is completed
if (xmlHttp.readyState == 4)
{
// continue only if HTTP status is "OK"
if (xmlHttp.status == 200)
{
try
{
displayFeed();
}
catch(e)
{
// display error message
displayError(e.toString());
}
}
else
{
displayError(xmlHttp.statusText);
}
}
}

// Processes server's response
function displayFeed()
{
// read server response as text, to check for errors
var response = xmlHttp.responseText;
// server error?
if (response.indexOf("ERRNO") >= 0
|| response.indexOf("error:") >= 0
|| response.length == 0)
throw(response.length == 0 ? "Void server response." : response);
// hide the "Loading..." message upon feed retrieval
document.getElementById("loading").style.display = "none";
// append XSLed XML content to existing DOM structure
var titlesContainer = document.getElementById("feedContainer");
titlesContainer.innerHTML = response;
// make the feed container visible
document.getElementById("feedContainer").style.display = "block";
// clear home page text
document.getElementById("home").innerHTML = "";
}

9. Create a new file named rss_reader.css, and add this code to it:
body
{
font-family: Arial, Helvetica, sans-serif;
font-size: 12px;
}

h1
{
color: #ffffff;
background-color: #3366CC;
padding: 5px;
}

h2
{
margin-top: 0px;
}

h3
{
margin-bottom: 0px;
}

li
{
margin-bottom: 5px;
}

div
{
padding: 10px;
}

a, a:visited
{
color: #3366CC;
text-decoration: underline;
}

a:hover
{
color: #ffffff;
background-color: #3366CC;
text-decoration: none;
}

.active a
{
color: #ffffff;
background-color: #3366CC;
text-decoration: none;
}

.active a:visited
{
color: #ffffff;
background-color:#3366CC;
text-decoration:none;
}

.active a:hover
{
color:#ffffff;
background-color: #3366CC;
text-decoration: none;
}

#feeds
{
display: inline;
float: left;
width: 150px;
background-color: #f4f4f4;
border:1px solid #e6e6e6;
}

#content
{
padding-left:170px;
border:1px solid #f1f1f1;
}

#loading
{
float: left;
display: inline;
width: 410px;
background-color: #fffbb8;
color: #FF9900;
border: 1px solid #ffcc00;
font-weight: bold;
}

.date
{
font-size: 10px;
color: #999999;
}

10. Load http://localhost/ajax/rss_reader in your web browser. The initial page should look like Figure 9.3. If you click one of the links, you should get something like Figure 9.2.

Figure 9.3: The First Page of the AJAX RSS Reader

What Just Happened?
It's not a really professional application at this stage, but the point is proven. It doesn't take much code to accomplish such a result, and any features you might think of can be added easily.

The user interface of this application is pretty basic and is all set up in index.php. We first need to include config.php, where our feeds are defined, in order to display the list of feeds in the left panel. Feeds are defined as an array of arrays. The main array's keys are numbers starting from 0, and each value is an associative array with two keys: title holds the feed's title and feed holds the feed's URL. The $feeds array looks like this:

$feeds = array ("0" => array("title" => "CNN Technology",
"feed" => "http://rss.cnn.com/rss/cnn_tech.rss"),
"1" => array("title" => "BBC News",
"feed" => "http://news.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml"),
"2" => array("title" => "Wired News",
"feed" => "http://wirednews.com/news/feeds/rss2/0,2610,3,00.xml"));

Translated into a more meaningful form, this is what the $feeds array looks like:

ID | Feed Title (title) | Feed URL (feed)
0 | CNN Technology | http://rss.cnn.com/rss/cnn_tech.rss
1 | BBC News | http://news.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml
2 | Wired News | http://wirednews.com/news/feeds/rss2/0,2610,3,00.xml

We have decided to store the feeds like this for simplicity, but it's easy to extend the code and store them in a database, if you need to.

In index.php we loop through these feeds and display them all as an unordered list, each feed being a link inside an <li> element. We assign each link an onclick handler that calls the getFeed function. This function takes two parameters: the <li> element (looked up by its ID) and the feed's URL. We need the element in order to highlight that link in the list, and we need the feed's URL to send it as a parameter in our HTTP request to the server. The urlencode function ensures that the URL is safely sent to the server, which will use urldecode to decode it.
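
The following JavaScript sketch illustrates why the feed URL has to be encoded before it is placed in a form-encoded request body. It is only an illustration under stated assumptions: the chapter encodes the URL on the server with PHP's urlencode, while this sketch uses JavaScript's encodeURIComponent; the function names buildParams and readFeedParam and the example URL are hypothetical.

```javascript
// Build the POST body in the "feed=<value>" shape that getFeed sends.
// Assumption: encoding is done here with encodeURIComponent instead of PHP's urlencode.
function buildParams(feedUrl) {
  return "feed=" + encodeURIComponent(feedUrl);
}

// Read the feed parameter back out of a form-encoded body,
// roughly what the server does when it fills $_POST['feed'].
function readFeedParam(body) {
  return new URLSearchParams(body).get("feed");
}

// Hypothetical feed URL containing characters that are special in a form body.
var url = "http://www.example.org/feeds/rss2?section=tech&format=xml";

var encodedBody = buildParams(url);
var rawBody = "feed=" + url; // no encoding at all

console.log(encodedBody);
console.log(readFeedParam(encodedBody) === url); // the URL survives the round trip
console.log(readFeedParam(rawBody));             // the URL is cut off at the "&"
```

Without encoding, the "&" inside the URL is read as a separator between two form fields, so the server would receive only part of the address.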

Two more things about index.php:

• Initially hidden, the <div> with id="loading" will be displayed while retrieving the feed, to inform the user that the feed is loading. This is useful when working with a slow connection or with slow servers, when the retrieval time will be long.
<div id="loading" style="display:none">Loading feed...</div>

• The <div> with id="feedContainer" is the actual container where the feed will be loaded. The feed will be dynamically inserted inside this div element.

<div id="feedContainer" style="display:none"></div>

rss_reader.js contains the standard XMLHttpRequest initialization, request sending, and response retrieval code. The getFeed function handles the sending of the HTTP request. First it loops through all feed links and un-highlights them by setting their CSS class to an empty string. It then highlights the active feed link:

/* Get number of feeds and loop through each one of them to change the class name of their container (<li>). */
var numberOfFeeds =
document.getElementById("feedList").childNodes.length;
for (i = 0; i < numberOfFeeds; i++)
document.getElementById("feedList").childNodes[i].className = "";
// Change the class name for the clicked feed to highlight it
feedLink.className = "active";
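
The same clear-then-highlight step can be sketched as a small standalone function. This is an illustration only: highlightFeed is a hypothetical name, and plain objects stand in for the <li> nodes so that the sketch runs outside a browser.

```javascript
// Clear the highlight from every list entry, then mark the clicked one.
// `nodes` stands in for document.getElementById("feedList").childNodes.
function highlightFeed(nodes, activeNode) {
  // `var i` keeps the counter local instead of creating a global variable
  for (var i = 0; i < nodes.length; i++) {
    nodes[i].className = "";
  }
  activeNode.className = "active";
}

// Fake list entries (assumption: only id and className matter here).
var fakeList = [
  { id: "feed-0", className: "active" },
  { id: "feed-1", className: "" },
  { id: "feed-2", className: "" }
];

highlightFeed(fakeList, fakeList[1]);
console.log(fakeList.map(function (n) { return n.id + ":" + n.className; }).join(" "));
```

In a browser, childNodes also contains the whitespace text nodes between the <li> elements; assigning className to them is harmless, which is why the loop does not need to filter them out.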

OK, the next step is to display the Loading feed... message:

// Display "Loading..." message while loading feed
document.getElementById("loading").style.display = "block";

And finally, we send the HTTP request with the feed's URL as parameter:

// Call the server page to execute the server-side operation
params = "feed=" + feed;
xmlHttp.open("POST", "rss_reader.php", true);
xmlHttp.setRequestHeader("Content-Type",
"application/x-www-form-urlencoded");
xmlHttp.onreadystatechange = handleHttpGetFeeds;
xmlHttp.send(params);
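
In the full listing, getFeed only sends the request when the XMLHttpRequest object is idle (readyState 0 or 4) and otherwise tries again after one second. The sketch below isolates that pattern. The names sendWhenIdle and fakeSchedule are hypothetical, and a fake request object and scheduler replace the browser's XMLHttpRequest and setTimeout so that the sketch can run anywhere.

```javascript
// Send immediately when the request object is idle (readyState 0 or 4);
// otherwise ask `schedule` (a stand-in for setTimeout) to retry later.
function sendWhenIdle(xhr, send, schedule, delayMs) {
  if (xhr.readyState === 4 || xhr.readyState === 0) {
    send();
    return true;
  }
  // A function is scheduled rather than a string of code,
  // so object arguments such as DOM elements survive the delay.
  schedule(function () {
    sendWhenIdle(xhr, send, schedule, delayMs);
  }, delayMs);
  return false;
}

// Fake objects, used only so the sketch runs without a browser.
var fakeXhr = { readyState: 1 }; // 1 = a request is in progress
var sent = 0;
var pending = [];
function fakeSchedule(fn) { pending.push(fn); }

sendWhenIdle(fakeXhr, function () { sent++; }, fakeSchedule, 1000);
fakeXhr.readyState = 4; // the earlier request has completed
pending.shift()();      // run the scheduled retry
console.log(sent);
```

In the browser, setTimeout would be passed as the schedule argument and the send callback would contain the open, setRequestHeader, and send calls shown above.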

The rss_reader.php script creates an instance of the CRssReader class and outputs an XSL-formatted XML document, which is returned to the client. The following lines do the hard work (the code that clears the output and prevents browser caching has been omitted):

$reader = new CRssReader(urldecode($_POST['feed']));
echo $reader->getFormattedXML();

CRssReader is defined in rss_reader.class.php. This PHP class handles XML retrieval and formatting. Getting a remote XML file is a piece of cake with PHP 5's new extension: SimpleXML. We'll also load the XSL template and apply it to the retrieved XML.

The constructor of this class retrieves the XML and saves it in a class member named $mXml and the XSL file in a class member named $mXsl:

// Constructor - creates an XML object based on the specified feed
function __construct($szFeed)
{
// retrieve the RSS feed in a SimpleXML object
$this->mXml = simplexml_load_file(urldecode($szFeed));
// retrieve the XSL file contents in a SimpleXML object
$this->mXsl = simplexml_load_file('rss_reader.xsl');
}

The getFormattedXML() function creates a new XSLTProcessor object in order to apply the XSL transformation. The transformToXML method simply returns a formatted XML document, after the XSL has been applied.

// Creates a formatted XML document based on retrieved feed
public function getFormattedXML()
{
// create the XSLTProcessor object
$proc = new XSLTProcessor;
// attach the XSL
$proc->importStyleSheet($this->mXsl);
// apply the transformation and return formatted data as XML string
return $proc->transformToXML($this->mXml);
}

What we need to accomplish with XSL is to loop through each "record" of the XML and display the data inside. A record is delimited by <item> and </item> tags.

In rss_reader.xsl we define a loop like this:

<xsl:for-each select="rss/channel/item">

For example, to display the current title, we write:

<h3><xsl:value-of select="title" /></h3>

Notice how we create a new <a> element with XSLT:

<xsl:element name="a">
<xsl:attribute name = "href">
<xsl:value-of select="link" />
</xsl:attribute>
read full article
</xsl:element>

We use this technique to build links to full articles on their actual websites.
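
For readers more at home in JavaScript than in XSLT, the loop in the stylesheet can be mirrored roughly as follows. This is only a sketch of the output the template produces, not how the application works (the chapter performs the transformation on the server with PHP). The names escapeHtml and renderItems are hypothetical, and the items are assumed to be plain objects with title, pubDate, description, and link properties.

```javascript
// Escape text before placing it in markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Mirror of the stylesheet's for-each: one <dt>/<dd> pair per item,
// with a "read full article" link built from the item's link value.
function renderItems(items) {
  var html = "<dl>";
  items.forEach(function (item) {
    html += "<dt><h3>" + escapeHtml(item.title) + "</h3></dt>" +
      "<dd><span>" + escapeHtml(item.pubDate) + "</span>" +
      "<p>" + escapeHtml(item.description) + "<br />" +
      "<a href=\"" + escapeHtml(item.link) + "\">read full article</a>" +
      "</p></dd>";
  });
  return html + "</dl>";
}

console.log(renderItems([{
  title: "Catchy Title",
  pubDate: "Mon, 17 Oct 2005 07:55:28 EDT",
  description: "A short description",
  link: "http://www.example.org/2005/11/catchy-title.html"
}]));
```

The escaping step corresponds to what xsl:value-of does by default: it emits the text value of a node with markup characters escaped, so item text is not interpreted as markup.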

There's also a bit of CSS code that will format the output according to our wish. Everything should be pretty clear if you take a quick look at rss_reader.css.

Summary
Today's Web is different than yesterday's Web, and tomorrow's Web will certainly be different than today's. Yesterday's Web was a collection of pages linked together; everything was static, and everybody kept things for themselves. The main characteristic of today's Web is information exchange between websites and/or applications.

Based on what you've learned in this chapter, you'll be able to build an even better RSS Reader, but why stop here? You hold some great tools that allow you to build great applications that could impact on tomorrow's Web!

3 comments:

Marine said...

Hello,

I got the book and I am struggling. I want to display the full text and not just the description. Any idea of how I could do that?

Thanks,

-M

Kest said...

Hello, I don't quite understand what you need. Could you describe your request more?

Marine said...

I wanted to display the full text of each feed. I did some research, so I am using a website to convert the feeds into full-text feeds. But I'd like a way to do this conversion behind the scenes in the application instead of having to manually use a third-party site to make the conversion.

http://rss.marineboudeau.com :feed #1 is summary only. feed #2 and #3 are full text feeds.

Hope this is clearer.