Tutorial 3: ‘Fixing’ the application cache with an iframe

by Matt Andrews

This is part 3 of a tutorial series on how to make an FT-style offline web app. If you haven’t already, please read part 1 and part 2 first.

One of the most important aspects of the web is the URL, and the direct connection between a URL and a single item of content. At the moment the demo app breaks this rule by delivering the whole experience through a single URL. To fix this we first need to deal with the application cache problem that caused us to adopt a single URL in the first place.

As I mentioned in Tutorial 1, one of the ill-thought-out features of the HTML5 application cache is that the page which points to the manifest in its html tag (<html manifest="myappcache.manifest">) will itself also be cached, whether you like it or not. This is a problem because each time a user arrives at the demo web app from a different URL (and in the case of the FT there are several million articles) their browser will add that URL to the application cache, and will keep doing so until the application cache runs out of space. So we have to find a solution that allows users to start using the demo app from any URL without littering their browser’s application cache with unwanted content.

Forks, stars, pulls and even issue reports all welcome on GitHub.com.

Experience the demo app working here.

Getting Started

Start by cloning (or downloading) the GitHub repository from Part 2.

git clone https://github.com/matthew-andrews/ft-style-offline-web-app-part-2.git

Requirements

  • Maintain compatibility with all previous browsers and devices
  • Allow the demo app to be loaded from any URL (except those beginning with api) in preparation for switching from hash tag URLs to real URLs.
  • Maintain control of what is in the application cache.

Preventing the application cache from storing masters with an iframe

We worked around this application cache problem on the Economist HTML5 app by using an iframe. The trick is that, rather than including a manifest attribute on the html tag of the page being loaded, we load an almost empty page (empty except for a little Javascript) that has the manifest attribute in its html tag, and we load it in a hidden iframe. This means that no matter which page your users load the demo web app from, the browser will only store one unique master entry, and therefore the files stored by the application cache are always the same.

We then rely on FALLBACK rules inside the application cache manifest to ensure that every URL within the demo web app, including the URL the user initially loaded the web app on, can be loaded without an internet connection.

The new and changed files required in this tutorial:

  • /source/appcache.js: The code in the web app responsible for loading the iframe.
  • /api/resources/index.php: Adds the above file to the web app Javascript and sets a new app-wide global, APP_ROOT, which will be the absolute path to the root of the web app.
  • /offline.manifest.php: Changes required by the iframe solution and to prepare for the History API (coming in Tutorial 4).
  • /manifest.html: The file to be loaded by the iframe.
  • /api/offline/index.html: A fallback for any request starting /api when no connection is available. All other requests will fall back to the root file (index.php).
  • /source/applicationcontroller.js: Updates the web app to use the new iframe application cache solution.
  • /source/articles/articlescontroller.js: Updates the path to the articles api call to make use of the APP_ROOT global. We can no longer rely on relative paths because the subfolder of the page the user is viewing can vary.
  • /index.php (renamed from index.html): Prevents the bootstrap (discussed in Tutorial 1) from using the old, non-iframe based offline caching solution, and changes all the relative paths to absolute paths.
  • /.htaccess: A simple mod_rewrite rule to route all requests not beginning with /api to index.php.

/source/appcache.js
This file will take care of informing the user that the website is capable of working offline and will request permission to do so. This is a good idea, because it means we can control at least the first part of the offline permission prompt experience and prepare the user for any odd things the browser may do.

Once it has permission this code will also be responsible for managing the iframe – both adding it to the DOM and removing it once the application cache is populated.

You might also have noticed that on line 17 we are making use of a new app-wide global variable, APP_ROOT. This will be set in api/resources/index.php and will be the path to the root of the web app (simply “/” if it is sitting at the top of a domain, like app.ft.com, or otherwise the path to the subfolder that contains the demo app).

Huge amount of credit to George Crawford, lead developer of The Economist HTML5 app and FT columnflow, for this and manifest.html that I’ve butchered for the purposes of this tutorial.

APP.appcache = (function () {
	'use strict';

	var statuses = {
		"-1": 'timeout',
		"0": 'uncached',
		"1": 'idle',
		"2": 'checking',
		"3": 'downloading',
		"4": 'updateready',
		"5": 'obsolete'
	}, offlineEnabled;

	function innerLoad() {
		var iframe = document.createElement('IFRAME');
		iframe.setAttribute('style', 'width:0px; height:0px; visibility:hidden; position:absolute; border:none');
		iframe.src = APP_ROOT + 'manifest.html';
		iframe.id = 'appcacheloader';
		document.body.appendChild(iframe);
	}

	function logEvent(evtcode, hasChecked) {
		var s = statuses[evtcode], loaderEl;
		if (hasChecked || s === 'timeout') {
			if (s === 'uncached' || s === 'idle' || s === 'obsolete' || s === 'timeout' || s === 'updateready') {
				loaderEl = document.getElementById('appcacheloader');

				// Guard against the iframe having already been removed
				if (loaderEl && loaderEl.parentNode) {
					loaderEl.parentNode.removeChild(loaderEl);
				}
			}
		}
	}

	function requestOffline() {
		return confirm("This website is capable of working offline. Would you like to enable this feature?");
	}

	function start() {
		if (offlineEnabled !== true && offlineEnabled !== false) {
			offlineEnabled = requestOffline();
			if (offlineEnabled) {
				localStorage.offlineEnabled = true;
			}
		}
		if (offlineEnabled === true) {
			innerLoad();
		}
	}

	// If offline mode already enabled, run innerLoad
	offlineEnabled = localStorage.offlineEnabled;

	if (offlineEnabled !== undefined) {
		offlineEnabled = (offlineEnabled === "true");
	}

	return {
		start: start,
		logEvent: logEvent
	};
}());
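
To make the removal rule inside logEvent easier to reason about, here is a minimal sketch of the same decision written as a pure function that can be run outside a browser. The names STATUSES and shouldRemoveLoader are hypothetical and are not part of the demo app; the status codes mirror the statuses table above.

```javascript
// Hypothetical helper (not in the demo app): mirrors the condition in logEvent.
var STATUSES = {
	"-1": 'timeout',
	"0": 'uncached',
	"1": 'idle',
	"2": 'checking',
	"3": 'downloading',
	"4": 'updateready',
	"5": 'obsolete'
};

function shouldRemoveLoader(evtcode, hasChecked) {
	var s = STATUSES[evtcode];

	// A timeout always ends the wait; any other status only counts once the
	// iframe has reported a checking, downloading or updateready state.
	if (!hasChecked && s !== 'timeout') {
		return false;
	}
	return s === 'uncached' || s === 'idle' || s === 'obsolete' ||
		s === 'timeout' || s === 'updateready';
}

console.log(shouldRemoveLoader(1, false));  // false
console.log(shouldRemoveLoader(1, true));   // true
console.log(shouldRemoveLoader(3, true));   // false
console.log(shouldRemoveLoader(-1, false)); // true
```

In other words, the iframe stays in the DOM while the browser is still checking or downloading, and is removed once the cache has settled or the wait has timed out.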

/api/resources/index.php
We need to add appcache.js into the application Javascript (the new line is line 14).

So that the web app can work from inside a subfolder we also need to set the new global variable, APP_ROOT, which I introduced earlier.

<?php
// Concatenate the files in the /source/ directory
// This would be a sensible point to compress your Javascript.
$js = '';
$js = $js . file_get_contents('../../libraries/client/fastclick.js');
$js = $js . 'window.APP={}; (function (APP) {';
$js = $js . file_get_contents('../../source/application/applicationcontroller.js');
$js = $js . file_get_contents('../../source/articles/articlescontroller.js');
$js = $js . file_get_contents('../../source/articles/article.js');
$js = $js . file_get_contents('../../source/datastores/network.js');
$js = $js . file_get_contents('../../source/datastores/indexeddb.js');
$js = $js . file_get_contents('../../source/datastores/websql.js');
$js = $js . file_get_contents('../../source/templates.js');
$js = $js . file_get_contents('../../source/appcache.js');
$js = $js . '}(APP));';

// Detect and set the absolute path to the root of the web app
// First get a clean version of the current directory (will include api/resources)
$appRoot = trim(dirname($_SERVER['SCRIPT_NAME']), '/');

// Strip api/resources from the end of the path
$appRoot = trim(preg_replace('/api\/resources$/i', '', $appRoot), '/');

// Ensure the path starts and ends with a slash or just / if on the root of domain
$appRoot = '/' . ltrim($appRoot . '/', '/');

$js = $js . 'APP_ROOT = "' . $appRoot . '";';

$output['js'] = $js;

// Concatenate the files in the /css/ directory
// This would be a sensible point to compress your css
$css = '';
$css = $css . file_get_contents('../../css/global.css');
$output['css'] = $css;

// Encode with JSON (PHP 5.2.0+) & output the resources
echo json_encode($output);
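
To see what the APP_ROOT detection produces for different installations, here is a rough JavaScript port of the three PHP steps above. It is for illustration only (the real code stays in PHP), the function name detectAppRoot is hypothetical, and it assumes the input starts with a slash, as SCRIPT_NAME does.

```javascript
// Illustrative JavaScript port of the PHP above; not part of the demo app.
function detectAppRoot(scriptName) {
	// Rough equivalent of trim(dirname($scriptName), '/')
	var dir = scriptName.slice(0, scriptName.lastIndexOf('/') + 1);
	var root = dir.replace(/^\/+|\/+$/g, '');

	// Strip api/resources from the end of the path, then trim slashes again
	root = root.replace(/api\/resources$/i, '').replace(/^\/+|\/+$/g, '');

	// Always start and end with a slash ("/" alone at the top of a domain)
	return '/' + (root + '/').replace(/^\/+/, '');
}

console.log(detectAppRoot('/api/resources/index.php'));             // "/"
console.log(detectAppRoot('/path/to/app/api/resources/index.php')); // "/path/to/app/"
```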

/offline.manifest.php
There’s only one change to the manifest file: I’ve added a FALLBACK section. The FALLBACK section of the HTML5 application cache uses basic pattern matching on URL prefixes (something like a mod_rewrite rule in Apache’s .htaccess) to provide an offline response for resources that aren’t specifically cached in the application cache. In our example, we don’t want to list every single article we have ever published (because there are too many), but even if we are offline, and therefore can’t display an article the user may have requested, we’d like the ability to display a branded “Article not found” error instead.

Under these new fallback rules, if a user does not have an internet connection and navigates to any URL that doesn’t begin with /api the browser should* return (“fallback to using”) the root page of our web app. Any URL beginning with /api should* get api/offline – which is just a file that contains the single word “offline”.

* With the exception of Internet Explorer 10. Although Internet Explorer 10 does support fallbacks, it only supports them for subresources – it can’t load a page via a fallback directly if the application cache doesn’t explicitly have that page cached offline.

<?php
header("Content-Type: text/cache-manifest");

// Detect the demo app root (taken from api/resources/index.php)
$appRoot = trim(dirname($_SERVER['SCRIPT_NAME']), '/');
$appRoot = '/' . ltrim($appRoot . '/', '/');
?>
CACHE MANIFEST
# 2012-10-29 v1
jquery.min.js

FALLBACK:
<?php echo $appRoot; ?>api <?php echo $appRoot; ?>api/offline/
<?php echo $appRoot; ?> <?php echo $appRoot; ?>


NETWORK:
*
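
The way the two FALLBACK lines above divide up requests can be modelled with a few lines of Javascript. This is a simplified sketch rather than what any browser actually runs: it assumes the app sits at the top of a domain, it ignores the NETWORK section and explicitly cached files, and the name resolveFallback is hypothetical. When more than one prefix matches, the longest one is used.

```javascript
// Hypothetical model of FALLBACK matching; real browsers do this internally.
function resolveFallback(path, rules) {
	var best = null;
	rules.forEach(function (rule) {
		var matches = path.indexOf(rule.namespace) === 0;
		if (matches && (best === null || rule.namespace.length > best.namespace.length)) {
			best = rule;
		}
	});
	return best ? best.fallback : null;
}

// The two rules from the manifest, with $appRoot assumed to be "/"
var rules = [
	{ namespace: '/api', fallback: '/api/offline/' },
	{ namespace: '/', fallback: '/' }
];

console.log(resolveFallback('/api/articles/', rules));              // "/api/offline/"
console.log(resolveFallback('/2012/11/01/another-article', rules)); // "/"
```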

/manifest.html
This is the file that will be included by the demo app in the iframe. It intentionally has no content to minimise its size (it will count towards the application cache’s storage limit) and because the contents of the iframe will never be shown to the user. This is the only file in the project that will have a manifest attribute set in the html tag.

We have also implemented some basic Javascript event listeners to listen for application cache events in order to pass notifications up to the parent (the page in which the iframe is contained).

<!DOCTYPE html>
<html lang="en" manifest="offline.manifest.php">
	<head>
		<script type="text/javascript" src="jquery.min.js"></script>
		<script type="text/javascript">
			$(document).ready(function () {
				'use strict';

				var checkTimer, status, hasChecked, loopMax = 60;

				function check() {
					if (applicationCache.status === applicationCache.CHECKING
							|| applicationCache.status === applicationCache.DOWNLOADING
							|| applicationCache.status === applicationCache.UPDATEREADY) {
						hasChecked = true;
					}
					if (applicationCache.status !== status) {
						status = applicationCache.status;
						parent.APP.appcache.logEvent(status, hasChecked);
					}
					loopMax = loopMax - 1;
					if (loopMax > 0) {
						if (checkTimer) {
							clearTimeout(checkTimer);
						}
						checkTimer = setTimeout(check, 1000);
					} else {
						parent.APP.appcache.logEvent(-1, hasChecked);
					}
				}

				if (parent.APP) {
					$(applicationCache).bind('cached checking downloading error noupdate obsolete progress updateready', check);
					checkTimer = setTimeout(check, 250);
				}
			});
		</script>
	</head>
	<body></body>
</html>
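
The core of the check function is that it only notifies the parent when the status has changed, and that it remembers whether the browser has ever been seen checking, downloading or holding an update. Here is a minimal sketch of just that part, with the status source and the notification callback passed in so that it can run outside a browser. The names createStatusWatcher, getStatus and notify are hypothetical, and the timer and the 60 second timeout are left out.

```javascript
// Hypothetical, browser-free version of the change detection in check()
function createStatusWatcher(getStatus, notify) {
	var lastStatus, hasChecked = false;

	// Status codes 2, 3 and 4 are CHECKING, DOWNLOADING and UPDATEREADY
	return function check() {
		var current = getStatus();
		if (current === 2 || current === 3 || current === 4) {
			hasChecked = true;
		}
		if (current !== lastStatus) {
			lastStatus = current;
			notify(current, hasChecked);
		}
	};
}

// Simulate a first visit: uncached, checking (seen twice), downloading, idle
var sequence = [0, 2, 2, 3, 1];
var reported = [];
var check = createStatusWatcher(
	function () { return sequence.shift(); },
	function (status, hasChecked) { reported.push([status, hasChecked]); }
);
while (sequence.length) { check(); }

console.log(JSON.stringify(reported)); // [[0,false],[2,true],[3,true],[1,true]]
```

The repeated checking status is reported only once, which keeps the number of calls into the parent page small.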

/api/offline/index.html
This is the page that the application cache will return for any api request if the network request fails. The reason we need a specific api fallback is so that we don’t accidentally allow the application cache to return the bootstrap html page in response to our api requests.

offline
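
One consequence of this choice can be sketched as follows, under the assumption that api responses are parsed as JSON (as the dataType: 'json' setting in the demo app’s $.ajax calls suggests). The bare word offline is not valid JSON, so client code can tell the fallback body apart from real data. The helper name parseApiBody is hypothetical.

```javascript
// Hypothetical helper: distinguishes a parsable api response from the
// plain-text fallback body.
function parseApiBody(body) {
	try {
		return { ok: true, data: JSON.parse(body) };
	} catch (e) {
		return { ok: false, data: null };
	}
}

console.log(parseApiBody('[{"id": 1}]').ok); // true
console.log(parseApiBody('offline').ok);     // false
```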

/source/applicationcontroller.js
There is only a tiny change required in the application controller, which is to run the APP.appcache.start(); method within the initialize method. This will ask the user for permission to let the demo app load offline, and will enable that feature if the user says yes. See line 26 for the new code required.

... etc ...

 function initialize(resources) {

        // Listen to the hash tag changing
        if ("onhashchange" in window) {
            $(window).bind("hashchange", route);
            
        // Support for old IE (which didn't have hash change)
        } else {
            (function () {
                var lastHash = window.location.hash;
                window.setInterval(function () {
                    if (window.location.hash !== lastHash) {
                        lastHash = window.location.hash;
                        route();
                    }
                }, 100);
            }());
        }

        // Set up FastClick
        fastClick = new FastClick(document.body);

        // Initialise appcache
        APP.appcache.start();

        // Inject CSS Into the DOM
        $("head").append("<style>" + resources.css + "</style>");

        // Create app elements
        $("body").append(APP.templates.application());

        // Remove our loading splash screen
        $("#loading").remove();

        route();
    }

... etc ...

/source/articles/articlescontroller.js

Again a very small change (line 6) to change url: 'api/articles' to url: APP_ROOT + 'api/articles/'.

... etc ...

    function synchronizeWithServer(failureCallback) {
        $.ajax({
            dataType: 'json',
            url: APP_ROOT + 'api/articles/',
            success: function (articles) {
                APP.article.deleteArticles(function () {
                    APP.article.insertArticles(articles, function () {
                        /*
                         * Instead of the line below we *could* just run showArticleList() but since
                         * we already have the articles in scope we needn't make another call to the
                         * database and instead just render the articles straight away.
                         */
                        $("#headlines").html(APP.templates.articleList(articles));
                    });
                });
            },
            type: "GET",
            error: function () {
                if (failureCallback) {
                  failureCallback();
                }
            }
        });
    }

... etc ...
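
To illustrate why the relative path stops working once the app can be loaded from any URL, the sketch below resolves both forms of the api path against two page URLs using the standard URL constructor. The example.com addresses and the APP_ROOT value are assumptions for illustration, and the URL constructor is used here only to show the resolution rules (it is not available in every browser the demo app supports).

```javascript
// Assumed example values; example.com/path/to/app/ stands in for the app root.
var APP_ROOT = '/path/to/app/';
var pages = [
	'http://example.com/path/to/app/',
	'http://example.com/path/to/app/subfolder/2012/11/01/another-article'
];

var resolved = pages.map(function (page) {
	return {
		relative: new URL('api/articles/', page).pathname,
		rooted: new URL(APP_ROOT + 'api/articles/', page).pathname
	};
});

resolved.forEach(function (r) {
	console.log(r.relative + ' vs ' + r.rooted);
});
// /path/to/app/api/articles/ vs /path/to/app/api/articles/
// /path/to/app/subfolder/2012/11/01/api/articles/ vs /path/to/app/api/articles/
```

From the root page both forms agree, but from the article page the relative form points at a folder that doesn’t exist, while the APP_ROOT form still reaches the api.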

/index.php (renamed from index.html)
Because this file will no longer only be loaded from the root of the demo app (it will also be served in response to any request to the web app, except for requests starting with /api), it has to know where the demo app’s root is so that it can load jQuery (see line 10) and the demo app’s resources (see line 55).

<?php
// Detect the demo app root (taken from api/resources/index.php)
$appRoot = trim(dirname($_SERVER['SCRIPT_NAME']), '/');
$appRoot = '/' . ltrim($appRoot . '/', '/');
?>
<!DOCTYPE html>
<html lang="en">
	<head>
		<meta name="viewport" content="width=device-width,initial-scale=1.0,maximum-scale=1.0,minimum-scale=1.0,user-scalable=no" />
		<script type="text/javascript" src="<?php echo $appRoot; ?>jquery.min.js"></script>
		<script type="text/javascript">
			$(document).ready(function () {

				var APP_START_FAILED = "I'm sorry, the app can't start right now.";
				function startWithResources(resources, storeResources) {

					// Try to execute the Javascript
					try {
						eval(resources.js);
						APP.applicationController.start(resources, storeResources);

					// If the Javascript fails to launch, stop execution!
					} catch (e) {
						if (typeof console !== "undefined") {
							console.log(e);
						}
						alert(APP_START_FAILED);
					}
				}
				function startWithOnlineResources(resources) {
					startWithResources(resources, true);
				}

				function startWithOfflineResources(e) {
					var resources;

					// If we have resources saved from a previous visit, use them
					if (localStorage && localStorage.resources) {
						resources = JSON.parse(localStorage.resources);
						startWithResources(resources, false);

					// Otherwise, apologize and let the user know
					} else {
						alert(APP_START_FAILED);
					}
				}

				// If we know the device is offline, don't try to load new resources
				if (navigator && navigator.onLine === false) {
					startWithOfflineResources();

				// Otherwise, download resources, eval them, if successful push them into local storage.
				} else {
					$.ajax({
						url: '<?php echo $appRoot; ?>api/resources/',
						success: startWithOnlineResources,
						error: startWithOfflineResources,
						dataType: 'json'
					});
				}

			});
		</script>
		<title>News</title>
	</head>
<body>
	<div id="loading">Loading&hellip;</div>
</body>
</html>

/.htaccess
Finally, use mod_rewrite in an Apache .htaccess file (assuming you are using an Apache server with this feature enabled) to route every request to index.php, except requests for specific files or folders and requests to the api.

<IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        
        # Match everything not under /api/ to index.php
        RewriteRule !^api/. index.php [L]
</IfModule>
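
The effect of these rules can be modelled in Javascript as a quick sanity check. This is a simplified sketch and not how Apache evaluates them: the function name routesToIndex is hypothetical, existsOnDisk stands in for the two RewriteCond checks, and relativePath is the request path relative to the folder containing the .htaccess file, without a leading slash.

```javascript
// Simplified model of the rewrite rule; existsOnDisk stands in for the
// two RewriteCond checks (-f and -d) and is an assumption of this sketch.
function routesToIndex(relativePath, existsOnDisk) {
	if (existsOnDisk) {
		return false; // real files and folders are served as-is
	}
	// The leading "!" in the RewriteRule negates the pattern ^api/.
	return !/^api\/./.test(relativePath);
}

console.log(routesToIndex('any-article', false));                            // true
console.log(routesToIndex('subfolder/2012/11/01/another-article', false));   // true
console.log(routesToIndex('api/does-not-exist', false));                     // false
console.log(routesToIndex('jquery.min.js', true));                           // false
```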

Wrapping Up

Users can now load the demo app from any URL. Where previously users could only access the demo app from the root of our web app (for example example.com/path/to/app/), they can now access it from, say, example.com/path/to/app/any-article or example.com/path/to/app/subfolder/2012/11/01/another-article. We stay in complete control of what is stored inside our users’ application cache, and we don’t break the rules of what to store in the application cache that we set out in Tutorial 1.

But at the moment the demo app isn’t aware of any difference between these URLs, so it will always return the same content. In next month’s article we will implement the History API, which will allow us to remove hash tag URLs and bring us one step closer to rendering the initial load of the app on the server, using real URLs and allowing the demo app to be crawlable by search engines.

If you think you’d like to work on this sort of thing and live (or would like to live) in London, we’re hiring!

By Matt Andrews – @andrewsmatt on Twitter & Weibo.

Continue to part 4 – Putting the web back into web app