Building an Asynchronous Portal using Lightweight HTML Injection
ZOIS Technical Note TN-2011-07-01.
Author and Audience
The aggregating nature of some web-pages makes for
'Portals', which are often unsatisfactory beasts in that they can
appear complex and slow to load. This Technical Note presents one
solution to this problem using some asynchronous Javascript
techniques. Importantly, it also presents a fall-back solution for when
Javascript is not available. It is anticipated that the audience would
be familiar with how the web works, Javascript, PHP and programming in
general. Written by Martin Sullivan[au],
ZOIS Limited, Cockermouth.
Abstract
The underlying mechanisms behind the RSS Asynchronous Portal Example
are discussed. It uses Asynchronous
Javascript technology, but has a fallback mechanism using a 'refresh
hack' for non-Javascript use. It was inevitable, given feedback from
the rest of the site, that the example would feature Cockermouth and
the Jobcentre Database Mirror.
Introduction
The author has at various times been involved in 'Portal' projects. They have used various technologies to provide a unified central page which contained links to diverse content, usually held at some geographically remote web-site. The content for such pages then comes from a number of locations and servers both inside and outside the location. The aesthetics of these Portal pages tend to be rather 'busy', with a predominantly boxy look-and-feel. They load slowly, as data has to be retrieved from a number of databases over sometimes slow links. Traditionally, the assembled page cannot be rendered until all of the components have been acquired.
While these kinds of systems have been studied intensively,
resulting in a number of standardised mechanisms[jp],
the author was intrigued by somewhat lighter JSON-like approaches
found around the web. Some systems have been built as experiments and
demonstrators. These can currently be found on the Home site-server[cr].
Materials and Platform
The original, more modest, plan was to demonstrate how a Really
Simple Syndication (RSS)[rs] feed could be
integrated into a localised home page, with the aim of increasing the
use of the RSS feeds provided by the Jobcentre Plus Mirror[nj] system. RSS feeds seem to be increasingly popular
and they provide a good mechanism to allow third-parties to produce
localised pages of which the latest postings at the local Jobcentre
could be a part. The system takes XML in the RSS schema; the
necessary manipulations are performed in PHP[ph] and
the pages are displayed with CSS[cs] stylings borrowed
from the existing ZOIS site. The critical asynchronous
component is provided by the Javascript[js]
XMLHttpRequest[hr] object.
Although laudably modest at outset, another aim imposed itself. The
technology used is Javascript, which, as downloaded executable code,
presents a not inconsiderable security risk. Such are the risks that
many users tend to switch it off, or use a selective blocking tool
such as NoScript. While the original designers of Javascript, no doubt,
intended it to be used as a useful adjunct, it sadly now seems to be core to
many web-sites. Such web-sites offer a poor experience when Javascript
is turned off. They will often fail completely or offer only a rather
rude suggestion that one should switch on Javascript if one is to
enter their on-line shop or read their important diatribe. It was
decided that the demonstrator should not display such behaviour, but
instead degrade to a non-Javascript mechanism that emulated, to some
useful degree, the asynchronous behaviour produced by using Javascript.
Method
Much of this methodology is a kind of AJAX-light. 'Proper' AJAX[ax] manipulates the Document Object Model (DOM), for which there is a Javascript Application Programming Interface. The DOM is a kind of internal database of the document that is rendered through the HTML layout engine, but its manipulation is, in the author's opinion, complicated. The techniques described in this note use a simpler mechanism of direct injection of HTML into an existing page. These techniques form the core of the "Asynchronous HTML and HTTP" mechanism, known as AHAH[ah]. While AHAH is not without its critics, mainly on the grounds of code-presentation separation and performance, it is thought to be the best technique to use in this case.
The Injection Function
The functionality of the site is achieved by 'injecting' HTML
asynchronously into an existing HTML element, having obtained it using
Javascript's XMLHttpRequest. Since the initial data is
encoded as an RSS feed, in XML, it is necessary to obtain it and
convert it to an 'inject-able' bit of HTML.
function inject ($url, $items, $thats_it = YES) {
    // http_get and http_parse_message come from the PECL HTTP extension
    $opts = array (
        'cookiesession' => YES,
        'redirect' => 3,
        'timeout' => 180,
        'useragent' => "ZOIS RSS " .
            "http://" . $_SERVER['HTTP_HOST'] . "/" .
            $_SERVER['PHP_SELF'] . " PECL::HTTP (PHP)");
    $c = http_parse_message (http_get ($url, $opts))->body;
    // cache this stuff in a future release
    $t = get_thing ('title', $c);
    echo "<div id=\"rssinject\"><h3>$t</h3>\n";
    $t = get_thing ('description', $c);
    echo "<p>$t\n";
    // preg_match_all takes $matches by reference itself; a call-time
    // '&' is a fatal error from PHP 5.4 onwards
    $count = preg_match_all ('/<item>(.*?)<\/item>/si', $c,
                             $matches);
    $number = $items < $count ? $items : $count;
    echo "<p>\n";
    for ($ix = 0; $ix < $number; $ix++)
        display_entry ($matches[1][$ix]);
    echo "</div><!-- rssinject -->\n";
    if ($thats_it) // it's recursive HTML injection, so exit
        exit;      // promptly, so they don't get all the other
                   // guff.
} // inject
As will be elucidated, this function is designed to be used in two
slightly different ways. In the first, the script has been called
recursively and returns only the HTML that is going to be injected
dynamically by Javascript; in the second, it is being used to populate a
rather more conventional static page. In the first instance
$thats_it is true, so the script exits promptly; otherwise control
returns to the calling code for further processing.
This function uses get_thing and
display_entry. These functions extract text from the XML
and display it in HTML respectively. Here's the code for
get_thing:
function get_thing ($thing, $entry) {
    if (!preg_match ('/<' . $thing . '>(.*?)<\/' . $thing . '>/si',
                     $entry,
                     $matches))
        return '';  // element not present in this entry
    $r = preg_replace ('/<!\[CDATA\[(.*?)\]\]>/s', '$1', $matches[1]);
    // remove CDATA protection
    return (preg_replace ('/<script[^>]*>.*?<\/script>/si', '', $r));
    // But defang any XSS
} // get_thing
This is fairly self-explanatory, but we take care to defang anything which may appear dubious. The sites from which RSS is obtained for the demonstration are largely trusted, but it is wise to be cautious.
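The same extraction can be sketched in Javascript, which makes it easy to try out on a scrap of XML. This is a minimal sketch, not the site's code: the name getThing and the sample item are illustrative assumptions. It also shows the limit of the defanging, which removes script elements only and leaves other markup untouched.

```javascript
// Hypothetical Javascript rendering of get_thing; the name is illustrative.
function getThing(thing, entry) {
  // Non-greedy, case-insensitive match of the first <thing>...</thing> pair.
  // [\s\S] stands in for PHP's /s flag (dot also matches newlines).
  var re = new RegExp('<' + thing + '>([\\s\\S]*?)</' + thing + '>', 'i');
  var m = re.exec(entry);
  if (m === null) {
    return '';            // element absent from this entry
  }
  // Remove CDATA protection, keeping the protected text.
  var text = m[1].replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, '$1');
  // Defang: drop any <script>...</script> blocks.
  return text.replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '');
}

// Made-up sample item, for illustration only.
var sampleItem = '<item><title><![CDATA[Shop Assistant]]></title>' +
  '<description>Part time<script>alert(1)</script></description></item>';

console.log(getThing('title', sampleItem));        // Shop Assistant
console.log(getThing('description', sampleItem));  // Part time
console.log(getThing('link', sampleItem) === '');  // true
```

Anything other than a script element, an inline event-handler attribute for example, would pass through this filter, which is why the feeds used should be largely trusted.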
This is the code for display_entry:
function display_entry ($entry = null) {
    if ($entry == null)
        return;
    $title = get_thing ('title', $entry);
    $date = get_thing ('pubdate', $entry);
    $description = get_thing ('description', $entry);
    $link = get_thing ('link', $entry);
    echo "<h4><a href=\"$link\">$title</a></h4>\n";
    echo "<p>$description\n";
    echo "<em>$date</em>";
} // display_entry
This extracts various chunks of text from the RSS stream and rewrites them in HTML. These are further styled by Cascading Style Sheets (CSS).
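Taken together, inject and display_entry amount to a small pipeline: split the feed into items, keep at most the requested number, and render each as a heading with a link. The following is a rough Javascript sketch of that flow under stated assumptions. The names firstItems, field and renderEntry and the three-item feed are invented for illustration, and field is a bare lookup without the CDATA and script handling of get_thing.

```javascript
// Return the bodies of at most `limit` <item> elements, in feed order.
function firstItems(xml, limit) {
  var bodies = [];
  var re = /<item>([\s\S]*?)<\/item>/gi;
  var m;
  while (bodies.length < limit && (m = re.exec(xml)) !== null) {
    bodies.push(m[1]);
  }
  return bodies;
}

// Bare element lookup (no CDATA or script handling in this sketch).
function field(name, body) {
  var re = new RegExp('<' + name + '>([\\s\\S]*?)</' + name + '>', 'i');
  var m = re.exec(body);
  return m ? m[1] : '';
}

// Render one item as a linked heading, as display_entry does for its title.
function renderEntry(body) {
  return '<h4><a href="' + field('link', body) + '">' +
         field('title', body) + '</a></h4>\n';
}

// Made-up feed, for illustration only.
var sampleFeed = '<rss><channel>' +
  '<item><title>One</title><link>http://example.org/1</link></item>' +
  '<item><title>Two</title><link>http://example.org/2</link></item>' +
  '<item><title>Three</title><link>http://example.org/3</link></item>' +
  '</channel></rss>';

var injected = firstItems(sampleFeed, 2).map(renderEntry).join('');
console.log(injected);
```

Asking for more items than the feed holds simply yields them all, which mirrors the way inject takes the smaller of $items and $count.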
Getting the HTML and Injecting It
The asynchronous methodology uses Javascript. The guts of this is
the function multi_ahah. It is examined in detail in this
section.
var req = new Array();
function multi_ahah(url, target_id, announce) {
    if (document.getElementById(target_id).innerHTML !=
        "Loading ... " + announce) {
        return;
    } // if
The HTML fragment has an initial 'Loading ...' text. This serves two purposes: notifying the user that something is about to happen, and acting as a placeholder for the injected HTML. Should the initial HTML fragment not be present, it is assumed that the desired HTML has already been injected.
    if (window.XMLHttpRequest) {
        req[target_id] = new XMLHttpRequest();
        req[target_id].onreadystatechange = function() {
            multi_ahahDone(target_id);
        };
        req[target_id].open("GET", url, true);
        req[target_id].send(null);
    } //if
} // multi_ahah
A request is generated and a 'call-back' registered, in this case,
multi_ahahDone.
function multi_ahahDone(target_id) {
    if (req[target_id].readyState == 4) { // only if req is "loaded"
        if (req[target_id].status == 200 ||
            req[target_id].status == 304) { // only if "OK"
            // 'var' keeps results local rather than an implicit global
            var results = req[target_id].responseText;
            document.getElementById(target_id).innerHTML =
                results;
        } else {
            document.getElementById(target_id).innerHTML =
                "<h4>Ahah error:\n" +
                req[target_id].statusText + "</h4>";
        } // else
    } // if
} // multi_ahahDone
For readers of other Notes on this site this code might seem familiar[qd], and indeed it should be. This is a direct evolution of the Javascript that powers the Bubble of Further Information effect on part of the Office Detail pages of the Jobcentre Database Mirror[nj]. It is documented elsewhere, but the observant will also note that this code has been adapted to track multiple requests. Since the HTML injection is straightforward, directly into the page, there's no fancy animation required. The code that invokes this runs as soon as the page is displayed, with the 'Loading ...' text.
<script type="text/javascript">
document.write ('<div id="feed-0">Loading ... Jobcentre Plus latest for Cockermouth</div>');
multi_ahah ('/crsse.php?url=http%3A%2F%2Fhome.zois.co.uk%2Fjcprss.php&items=5', 'feed-0', 'Jobcentre Plus latest for Cockermouth');
</script>
Just to complicate matters even further, the above fragment is
automatically generated by a PHP script, so the div
id is unique and the URL and item count can be retrieved
from a fairly central place, in this case a PHP array. The
Javascript's URL fragment is self-referential, and when the PHP script
receives the appropriate arguments it knows to invoke the
inject function, discussed above.
$url = array_key_exists ("url", $_GET) ? $_GET["url"] : NULL;
$items = array_key_exists ("items", $_GET) ? $_GET["items"] : 0;
// Is it a recursive injection for AJAXy stuff?
if (isset ($url) && $items > 0)
    inject ($url, $items); // produce the 'inner HTML', doesn't
                           // return
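The generation of the script fragment from a central table can be sketched as follows. This is written in Javascript for ease of experiment, whereas the site does it in PHP. The function names and the exact shape of the table are assumptions, although the ordering of URL, item count and announcement follows the $feed[0], $feed[1], $feed[2] usage seen elsewhere in this Note.

```javascript
// Hypothetical feed table: [feed URL, number of items, announcement text].
var feeds = [
  ['http://home.zois.co.uk/jcprss.php', 5,
   'Jobcentre Plus latest for Cockermouth']
];

// Build the self-referential URL that makes the script call inject.
function injectUrl(self, feedUrl, items) {
  return self + '?url=' + encodeURIComponent(feedUrl) + '&items=' + items;
}

// Build the <script> fragment for feed number ix.
// Assumes the announcement text contains no quotes or markup.
function feedFragment(self, ix, feed) {
  var id = 'feed-' + ix;          // unique per feed
  var announce = feed[2];
  return '<script type="text/javascript">\n' +
    "document.write ('<div id=\"" + id + "\">Loading ... " +
    announce + "</div>');\n" +
    "multi_ahah ('" + injectUrl(self, feed[0], feed[1]) + "', '" +
    id + "', '" + announce + "');\n" +
    '</script>\n';
}

console.log(feedFragment('/crsse.php', 0, feeds[0]));
```

Generating the placeholder and the multi_ahah call from the same announcement string matters, because multi_ahah only proceeds when the element's content is exactly "Loading ... " followed by that text.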
Simulation in the Absence of Javascript
Much of the rest of the code is concerned with generating an appropriately pretty and informative container for this Javascript based example. The question then arose, what to do in the absence of Javascript, if it is turned off or not available in the browser.
The mechanism chosen was the automated update. In this technique a 'holding page' is displayed while an outstanding request to the back-end server is run; when that page is completed, it is displayed in the holding page's stead. The technique works best if the holding page and the completed pages are largely similar, with respect to static text, pictures and decoration. While the technique can be applied to a graded multiple-update approach, in this instance only one final update was used.
Firstly, the PHP code needs to realise that the browser does not support Javascript, or that it has been switched off.
echo "<noscript>\n";
echo "<meta http-equiv=\"Refresh\" content=\"0;";
echo $_SERVER['PHP_SELF'];
echo "?noscript=1";
echo "\">\n";
echo "</noscript>\n";
This code causes the page to refresh immediately, with the same calling URL, but with "noscript=1" appended to it. The noscript value indicates to the back-end that we've displayed a place-holder and now the real page should be constructed. When this page is ready it will replace the currently displayed one.
The above code needs to be bracketed, to stop the page being constantly refreshed.
if (!$noscript) {
    echo "<noscript>\n";
    echo "<meta http-equiv=\"Refresh\" content=\"0;";
    echo $_SERVER['PHP_SELF'];
    echo "?noscript=1";
    echo "\">\n";
    echo "</noscript>\n";
} // noscript
The $noscript variable having been set thusly:
if (!isset ($noscript)) // if not already set
    $noscript = array_key_exists ("noscript", $_GET) ?
        $_GET["noscript"] : NO;
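The effect of the bracketing can be seen by generating the refresh block for the two requests in turn. The sketch below is in Javascript rather than PHP and is only a model of the behaviour. The function name noscriptRefresh is an assumption, and '/crsse.php' stands in for the value of $_SERVER['PHP_SELF'].

```javascript
// Return the <noscript> refresh block for the holding page, or an empty
// string when this request is already the refreshed one.
function noscriptRefresh(self, noscript) {
  if (noscript) {
    return '';            // final page: no further refresh, so no loop
  }
  return '<noscript>\n' +
    '<meta http-equiv="Refresh" content="0;' + self + '?noscript=1">\n' +
    '</noscript>\n';
}

// First request: the holding page carries the refresh.
var holding = noscriptRefresh('/crsse.php', false);
// Second request (?noscript=1): nothing is emitted.
var finalPage = noscriptRefresh('/crsse.php', true);

console.log(holding);
console.log(finalPage === '');
```

A browser running Javascript ignores the content of the noscript element, so only browsers without Javascript ever follow the refresh.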
Elsewhere in the code, the inject function can now be
used to acquire the RSS XML, convert it to HTML and display it
normally.
if ($noscript) // called again with noscript option
    echo inject ($feed[0], $feed[1], NO);
The original page will have a placeholder at this point ...
else { // something to look at, while we wait
    echo "<noscript>Loading ... $feed[2]</noscript>\n";
Specialised Responses to Unusual Browsers
This all works well, with browsers tested including Opera, Firefox, Internet Explorer, Safari and Chrome; with and without Javascript enabled. The only browser that appears to have difficulties with this refresh technique is Lynx, which treats refresh URLs specially and asks, in this instance, supercilious questions of them. A small fragment of code deals with this:
if (isset ($_SERVER['HTTP_USER_AGENT']) && // header may be absent
    preg_match ('/Lynx/', $_SERVER['HTTP_USER_AGENT']))
    $noscript = YES; // Lynx is special. Keep the Faith.
The casual reader should note that the author normally disapproves of
adjusting web-server behaviour based on User-Agent strings. He is also
noted for the use of YES and NO in Boolean
situations, as may be observed in various code fragments found in this
Note. It is felt that this is easier to read, but requires this:
define ("YES", true);
define ("NO", false);
A full working example of code using these techniques is available from the author.
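The decision between holding page and complete page, including the special case for Lynx, can be summarised in one small function. This is a hedged Javascript sketch of the logic only. The name wantsStaticPage and the query object are assumptions, and the sketch treats just the value "1" as set, that being the only value the code in this Note generates.

```javascript
// Decide whether to build the complete static page straight away.
// query: object of GET parameters; userAgent: User-Agent header, if any.
function wantsStaticPage(query, userAgent) {
  if (/Lynx/.test(userAgent || '')) {
    return true;          // Lynx queries refresh URLs, so skip the holding page
  }
  return query.noscript === '1';
}

console.log(wantsStaticPage({}, 'Mozilla/5.0'));                     // false
console.log(wantsStaticPage({ noscript: '1' }, 'Mozilla/5.0'));      // true
console.log(wantsStaticPage({}, 'Lynx/2.8.7rel.1 libwww-FM/2.14'));  // true
```

Checking for Lynx first corresponds to setting $noscript before the isset test, so that the later test leaves the value alone.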
Presentation using CSS
The HTML derived from the RSS XML is presented using the Cascading
Style Sheet 'navigation' style. Normally this is used to provide
navigation side-bars, such as may be found with this Technical Note,
but by placing the blocks in isolation, after a <br clear=all>, they
should appear as a series of narrow
columns which arrange themselves side-by-side depending
on the page width. Such multi-column formatting seems traditional on
news sites, even if it is not achieved in this way. The CSS code looks
like this:
/* Navigation is a bunch of ancillary text that should float to the right
   as a separate column if there's space, and stay at the bottom (or
   wherever it's put) if not. */
#navigation {
    float: left;
    margin-top: 15px;
    margin-right: 2%;
    margin-left: 2%;
    max-width: 245px;
    padding: 10px;
    background: ivory;
    border-style: dotted;
    border-color: grey;
    border-width: 1px
} /* #navigation */
#navigation img {
    float: left;
    padding: 2px
} /* img */
Discussion
The mechanisms behind the Cockermouth RSS[cr] and
related Generalised RSS Example
pages have been discussed. This approach can equally well be used in
more conventional Portal-based sites. A typical example would involve
the selection of data for a customer from a variety of databases. It
could then be displayed asynchronously as it is delivered by the remote
servers. Such systems are likely to be closed and thus Javascript can be
trusted. In such trusted systems there would be no need to provide a
non-Javascript alternative.
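How such a page fills in as the remote servers answer can be modelled without a browser. In the sketch below, everything is a stand-in: the page object plays the part of the document, respond plays the part of a server replying, and the target names and contents are invented. Only the per-target bookkeeping follows the req array of multi_ahah.

```javascript
var page = {};       // stands in for the document: target id -> inner HTML
var inFlight = {};   // target id -> outstanding request record

// Start a request for one target and show its placeholder.
function request(url, targetId) {
  page[targetId] = 'Loading ...';
  inFlight[targetId] = { url: url };
}

// Stand-in for a remote server answering, in whatever order it likes.
function respond(targetId, status, body) {
  delete inFlight[targetId];
  page[targetId] = (status === 200)
    ? body
    : '<h4>Ahah error: ' + status + '</h4>';
}

// Three hypothetical back-end sources for one customer page.
request('/orders', 'orders');
request('/invoices', 'invoices');
request('/tickets', 'tickets');

respond('tickets', 200, '<p>2 open tickets</p>');   // fastest server first
console.log(page.orders, '|', page.tickets);

respond('orders', 200, '<p>3 orders</p>');
respond('invoices', 503, '');                        // one source fails
console.log(page);
```

Each box is updated as soon as its own source replies, and a failure in one source leaves the others unaffected.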
Updates
As with other Technical Notes, feedback is actively solicited. The
author may be contacted via the e-mail address found on his public
biography page[au]. Should something require changing
or enhancing then the fact will be acknowledged with attribution in this
Update section.
References
References found in this section, and in particular the HTML links were correct at time of writing (2011-07-01).
- [au] Martin Sullivan:
- http://www.zois.co.uk/people/martin_sullivan
- [jp] Java Portlet Specification - JSR 168:
- http://developers.sun.com/portalserver/reference/techart/jsr168
- [nj] The Unofficial National Jobcentre Plus Mirror:
- http://home.zois.co.uk/jcpnational.html
- [rs] Really Simple Syndication:
- http://www.techxtra.ac.uk/rss_primer
- [ph] PHP Hypertext Preprocessor:
- http://www.php.net
- [cs] Cascading Style Sheets:
- http://www.w3.org/Style/CSS
- [hr] XMLHttpRequest:
- http://www.w3.org/TR/XMLHttpRequest
- [js] Javascript:
- http://www.ecmascript.org
- [qd] TN-2009-11-15 Quick and Dirty Ajax:
- http://www.zois.co.uk/tn/tn-2009-11-15.html
- [cr] Cockermouth RSS Asynchronous Portal Example:
- http://home.zois.co.uk/crsse.php
- [ax] Ajax:
- http://en.wikipedia.org/wiki/Ajax_(programming)
- [ah] AHAH - Asynchronous HTML and HTTP:
- http://microformats.org/wiki/rest/ahah
~Z~